CANADIAN JOURNAL OF MATHEMATICS
Journal Canadien de Mathématiques

VOL. XI - NO. 2
1959

Some remarks on prime factors of integers    P. Erdős    161
On the representation of functions as Fourier transforms    P. G. Rooney
The elliptic integrals of the third kind    E. H. Neville
Mixed problems for hyperbolic equations of general order    G. F. D. Duff
An improved result concerning singular manifolds of difference polynomials    R. M. Cohn
Subspaces of a generalized metric space    H. A. Eliopoulos
On the irreducibility of convex bodies    A. C. Woods
Dense subgraphs and connectivity    R. E. Nettleton, K. Goldberg, and M. S. Green
The term and stochastic ranks of a matrix    A. L. Dulmage and N. S. Mendelsohn
Disjoint transversals of subsets    P. J. Higgins
Separation and approximation in topological vector lattices    S. Leader
Tensor products of Banach algebras    B. R. Gelbaum
A class of solvable groups    D. Gorenstein and I. N. Herstein


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 


University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, G. F. D. Duff, R. D. James, R. L. Jeffery, 
J.-M. Maranda, G. de B. Robinson, H. Zassenhaus


with the co-operation of 


A. D. Alexandrov, R. Brauer, W. P. Brown, D. B. DeLury, J. Dixmier, 
P. Hall, N. S. Mendelsohn, P. Scherk, J. L. Synge, A. W. Tucker, 
W. J. Webber, M. Wyman 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, G. F. D. Duff, University of Toronto. Authors are 
asked to write with a sense of perspective and as clearly as possible, 
especially in the introduction. Regarding typographical conventions, 
attention is drawn to the Author’s Manual of which a copy will be 
furnished on request. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $8.00. This is reduced to $4.00 for individual members of recognized 
Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of Alberta Assumption University 
University of British Columbia Carleton College 
Dalhousie University Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Queen’s University Royal Military College 
St. Mary's University University of Toronto 

National Research Council of Canada 

and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 











SOME REMARKS ON PRIME FACTORS OF INTEGERS 
P. ERDŐS


1. Let $1 < a_1 < a_2 < \ldots$ be a sequence of integers and let $N(x)$ denote
the number of $a$'s not exceeding $x$. If $N(x)/x$ tends to a limit as $x$ tends to
infinity we say that the $a$'s have a density. Often one calls it the asymptotic
density to distinguish it from the Schnirelmann or arithmetical density. The
statement that almost all integers have a certain property will mean that
the integers which do not have this property have density 0. Throughout
this paper $p$, $q$, $r$ will denote primes.
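The definition of density can be illustrated numerically. The following is only a sketch: the function names are hypothetical, and a finite ratio $N(x)/x$ merely suggests the limiting value, it does not prove that a density exists.

```python
import math


def density_estimate(belongs, x):
    """Return N(x)/x, where N(x) counts the n <= x for which belongs(n) is true."""
    count = sum(1 for n in range(1, x + 1) if belongs(n))
    return count / x


def is_multiple_of_three(n):
    # The multiples of 3 have density 1/3.
    return n % 3 == 0


def is_perfect_square(n):
    # The perfect squares have density 0, since N(x) is about sqrt(x).
    root = math.isqrt(n)
    return root * root == n


if __name__ == "__main__":
    for x in (300, 30_000):
        print(x,
              density_estimate(is_multiple_of_three, x),
              density_estimate(is_perfect_square, x))
```

In the language of the text, "almost all integers are not perfect squares", because the exceptional set has density 0.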

I conjectured for a long time that, if $\varepsilon > 0$ is any given number, then
almost all integers $n$ have two divisors $d_1$ and $d_2$ satisfying


(1) $d_1 < d_2 < (1 + \varepsilon) d_1$.
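Condition (1) is easy to test for a single integer. The sketch below uses hypothetical helper names and brute-force trial division; it can be used to tabulate the proportion of $n \le x$ satisfying (1), but such a table says nothing rigorous about the limiting density.

```python
def divisors(n):
    """Return the divisors of n in increasing order (trial division)."""
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large[::-1]


def has_close_divisors(n, eps):
    """True if n has two divisors d1 < d2 < (1 + eps) * d1, as in (1)."""
    ds = divisors(n)
    # Comparing consecutive divisors suffices: if some pair is close, the
    # divisor immediately following d1 is at least as close to d1.
    return any(d2 < (1 + eps) * d1 for d1, d2 in zip(ds, ds[1:]))


def fraction_with_close_divisors(x, eps):
    """Proportion of n <= x having two divisors that satisfy (1)."""
    return sum(has_close_divisors(n, eps) for n in range(1, x + 1)) / x
```

For example, 12 has the divisors 3 and 4, which satisfy (1) with $\varepsilon = 0.4$, whereas 6 has no such pair for that $\varepsilon$.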


I proved (1, p. 691) that the integers with two divisors satisfying (1) 
have a density, but I cannot prove that this density has the value 1. How- 
ever, analogous questions can be asked about the prime divisors of integers 
and a more complete result is contained in the following theorem. 


THEOREM 1. Let $\varepsilon_p > 0$, $\delta_p = \varepsilon_p$ if $\varepsilon_p < 1$ and $\delta_p = 1$ if $\varepsilon_p \ge 1$. The divergence
of $\sum_p \delta_p/p$ is a necessary and sufficient condition that almost all integers
should have two prime factors $p$ and $q$ satisfying
(2) $p < q < p^{1+\varepsilon_p}$.

From the prime number theorem we have

$p_n = (1 + o(1))\, n \log n$;


thus $\sum_p \varepsilon_p p^{-1}$ will diverge if $\varepsilon_p = (\log\log p)^{-1}$, but will converge if
$\varepsilon_p = (\log\log p)^{-1-c}$, for any $c > 0$.
Further, we shall outline a proof of 


THEOREM 2. The density of integers $n$ which have two prime factors $p$ and $q$
satisfying
$p < q < p^{1 + c/\log\log n}$
equals $1 - e^{-c}$.
Let $p_1 < p_2 < \ldots < p_k$ be the distinct prime factors of $n$. Define the
real number $\eta_i$ by $p_i^{\eta_i} = p_{i+1}$. A famous result of Hardy and Ramanujan
(4) asserts that $k = (1 + o(1)) \log\log n$ for almost all $n$. I proved (2, p.
533, Theorem 9) that, for almost all $n$, the number of $\eta$'s not exceeding
$t$ ($t > 1$) is

$(1 + o(1))(1 - 1/t) \log\log n$.

Received April 23, 1958.


Theorem 2 can be stated as follows: the density of integers with

$\min_{1 \le i < k} \eta_i < 1 + \dfrac{c}{\log\log n}$

is $1 - e^{-c}$. By similar methods, we can prove that the density of integers $n$
satisfying

$\max_{1 \le i < k} \eta_i > c \log\log n$

is $1 - \exp[-1/c]$. Further, we can prove that the divergence of $\sum_p \delta_p/p$
($\delta_p < 1$) is the necessary and sufficient condition that almost all integers $n$
should have a prime factor $p$ such that $n \equiv 0 \pmod p$, and $n \not\equiv 0 \pmod q$
for all primes $q$ with

$p < q < p^{1/\delta_p}$.

We shall not give the proof of these results, since they are similar to those
of Theorems 1 and 2.
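The exponents $\eta_i$ and the reformulation of Theorem 2 can be explored with a short computation. This is a sketch under stated assumptions: the function names are hypothetical, factorization is by trial division, and because $\log\log n$ grows extremely slowly the empirical proportion for accessible $x$ need not be close to the limiting value $1 - e^{-c}$.

```python
import math


def distinct_prime_factors(n):
    """Return the distinct prime factors p_1 < p_2 < ... < p_k of n."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def etas(n):
    """Return [eta_1, ..., eta_{k-1}], where p_i ** eta_i == p_{i+1}."""
    ps = distinct_prime_factors(n)
    return [math.log(q) / math.log(p) for p, q in zip(ps, ps[1:])]


def has_close_prime_pair(n, c):
    """True if min eta_i < 1 + c / log log n (the condition of Theorem 2)."""
    e = etas(n)
    if not e:
        return False  # fewer than two distinct prime factors
    return min(e) < 1.0 + c / math.log(math.log(n))


def empirical_fraction(x, c):
    """Proportion of n <= x satisfying the condition of Theorem 2."""
    return sum(has_close_prime_pair(n, c) for n in range(2, x + 1)) / x
```

Only consecutive prime factors need to be compared: if $p < q < p^{1+\delta}$ for some pair of prime factors, then the prime factor immediately following $p$ satisfies the same inequality.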


2. First, we show that the condition of Theorem 1 is necessary. In fact,
we show that if $\sum_p \delta_p/p < \infty$, then the upper density of integers having two
prime divisors satisfying (2) is less than one. Since $\sum_p \delta_p p^{-1} < \infty$, it is clear
that

$\sum_{\varepsilon_p \ge 1} p^{-1} < \infty$.

Denote by $b_1 < b_2 < \ldots$ the integers consisting of the primes $p$ satisfying
$\varepsilon_p \ge 1$ and the integers of the form $pq$, where $\varepsilon_p < 1$ and $p < q < p^{1+\varepsilon_p}$.
Clearly the integers not divisible by any $b$ have no divisor of the form $pq$
satisfying (2). But $\sum_i b_i^{-1} < \infty$; thus by a well-known and simple argument
(3, p. 279) one can show that the density of integers divisible by a $b$ is less
than one. We really only proved that if $\sum_p \delta_p/p < \infty$ then the upper density
of integers having a divisor of the form $pq$ satisfying (2) is less than one. In
fact it would be quite easy to show that the density in question exists.

Now we prove the sufficiency of Theorem 1. We first show that it will 
suffice to prove the following 


THEOREM 1'. Let $\varepsilon_p < \tfrac12$, $\varepsilon_p \to 0$, $\sum_p \varepsilon_p/p = \infty$. Then the density of integers $n$
having two prime divisors $p$ and $q$ satisfying

$p < q < p^{1+\varepsilon_p}$

is 1.


To deduce the sufficiency of the condition of Theorem 1 from Theorem 1'
it will suffice to show that if $\sum_p \delta_p/p = \infty$ there always exists an $\varepsilon_p' \le \varepsilon_p$,
$\varepsilon_p' < \tfrac12$, $\sum_p \varepsilon_p'/p = \infty$. To show this we observe that if $\sum_p \delta_p/p = \infty$ then
either there exists a subsequence p, with 


Za Pi = ©, €ps < i 
and then we put 

5 = Epi, l<qi< o, 
¢, = Oif p # p,, or for a certain 

c>h Lp =o. 


But in this case there clearly exists an $\varepsilon_p' \le \varepsilon_p$ such that

$\varepsilon_p' \to 0, \quad \varepsilon_p' < \tfrac12, \quad \sum_p \varepsilon_p'/p = \infty$,

which completes our proof.


Now we prove Theorem 1'. Put

$A(x) = \sum_{p < x} \dfrac{1}{p} \sum_{p < q < p^{1+\varepsilon_p}} \dfrac{1}{q}$;

then, since $\sum_p \varepsilon_p/p = \infty$,

$A(x) \to \infty$ as $x \to \infty$.


We have to show that almost all integers have at least one divisor of the
form $pq$, where $p < q < p^{1+\varepsilon_p}$. Instead of this we shall prove the stronger
result that if $f(n)$ denotes the number of divisors of $n$ of the above form
then, for almost all $n$,


(3) $f(n) = (1 + o(1)) A(n)$.
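The two quantities compared in (3) can be computed directly for small arguments. The sketch below is illustrative only: the names `A` and `f` mirror the text, `eps` is an assumed callable giving $\varepsilon_p$, and the helper functions are hypothetical. Since $A(x)$ grows extremely slowly, such a computation shows how the quantities are defined rather than how they behave asymptotically.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def distinct_prime_factors(n):
    """Distinct prime factors of n in increasing order."""
    return [p for p in primes_up_to(n) if n % p == 0]


def A(x, eps):
    """A(x) = sum over primes p < x of (1/p) * sum of 1/q, p < q < p**(1 + eps(p))."""
    ps = [p for p in primes_up_to(x) if p < x]
    if not ps:
        return 0.0
    limit = int(max(p ** (1 + eps(p)) for p in ps)) + 1
    candidates = primes_up_to(limit)
    total = 0.0
    for p in ps:
        upper = p ** (1 + eps(p))
        total += sum(1.0 / q for q in candidates if p < q < upper) / p
    return total


def f(n, eps):
    """Number of divisors pq of n with primes p < q < p**(1 + eps(p))."""
    ps = distinct_prime_factors(n)
    return sum(1 for i, p in enumerate(ps)
               for q in ps[i + 1:] if q < p ** (1 + eps(p)))
```

With the constant choice $\varepsilon_p = 0.4$, for instance, the only qualifying pairs with $p < 8$ are $(5, 7)$, $(7, 11)$ and $(7, 13)$, so that $A(8) = 1/35 + 24/1001$.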


Or, because of the slow growth of $A(n)$, we shall in fact prove that

(4) $f(n) = (1 + o(1)) A(x)$,

except for $o(x)$ values of $n \le x$. It is easy to see that (3) and (4) are equivalent
since


(5) Ate) - AG) = SF ! - 2 tA 


ricp- z p p<q<p!* tp q ri<p< z p 


by the well-known estimate 


= o(1) 


| 
, p -_ log log x + ¢, +O ( -) ; 


pez 
To prove (4) we shall use Turán's method (6, pp. 274-6). We have

(6) $\sum_{n=1}^{x} (f(n) - A(x))^2 = x (A(x))^2 - 2A(x) \sum_{n=1}^{x} f(n) + \sum_{n=1}^{x} f^2(n)$.


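Since

$f(n) = \sum_{pq \mid n,\ p < q < p^{1+\varepsilon_p}} 1$,

we may write

(7) $\sum_{n=1}^{x} f(n) = \sum_{p < x} {\sum_{p < q < p^{1+\varepsilon_p}}}' \left[\dfrac{x}{pq}\right] = x \sum_{p < x} {\sum_{p < q < p^{1+\varepsilon_p}}}' \dfrac{1}{pq} + O(x)$,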


where the dash indicates that pg < x.* Now e, < } implies that for p < x#, 
pq <x 


, 


Aey<X ¥ < A(x). 


Pp PpK<e<p't& r 
Thus from (5),
(8) $\sum_{n=1}^{x} f(n) = x A(x) + O(x)$.
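Similarly,

(9) $\sum_{n=1}^{x} f^2(n) = \sum \left[\dfrac{x}{pq}\right] + \sum \left[\dfrac{x}{\{p_1 q_1, p_2 q_2\}}\right]$,

where in the second sum

$p_1 < q_1 < p_1^{1+\varepsilon_{p_1}}, \quad p_2 < q_2 < p_2^{1+\varepsilon_{p_2}},$

$p_1 q_1 \ne p_2 q_2$, and $\{p_1 q_1, p_2 q_2\}$ denotes the least common multiple of $p_1 q_1$ and
$p_2 q_2$.
The first sum on the right of (9) is $(1 + o(1)) x A(x)$. For the second sum
we have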


(10) > Fre sal" >> >i gy oe 


where the dash indicates that pig: # poge and {p19g1, Pog2} < x. Clearly, from 
(5), if pi < x", po < o {piqi, P2q2} < x 


oe ££ 


—_ “ae 





+ O(x), 


> (A(x"))? + O(1) = (1 + o(1))(A(x))" 


a saI P2q2 } 
On the other hand, by a — argument, 
1 
12 en 4 - = 
ea x x {Pin, es } ‘@) + > rites 
where in >>” 
n<nm<n, rs < max (r,'*"", re°**e), 


or 73 < 7:7, and 7; < x. (12) follows from the fact that rirers = {f191, Pogo} 
has four solutions. Now 


*Since p < gq, the equation pg = A has at most one solution p, g and so there are at most x 
terms in the double sum. Hence the error in omitting the square brackets is at most x. 
















ES < Ei FS ft Sl <cem, 


Tifa s paz p p<e<p'*+p J pcr<p? 7 


hence 


(13) i arpa 7 (i + o(t)A*G). 


Thus, by (9), (10), and (13),

(14) $\sum_{n=1}^{x} f^2(n) = (1 + o(1))\, x (A(x))^2$.

Hence from (6), (8), and (14)

$\sum_{n=1}^{x} (f(n) - A(x))^2 = o(x A^2(x))$,

which proves that $f(n) = (1 + o(1)) A(x)$, except for $o(x)$ values of $n \le x$.
Thus Theorem 1 is proved.


3. Now we outline the proof of Theorem 2. Denote by $a_1 < a_2 < \ldots < a_k \le x$
the integers not exceeding $x$ of the form $pq$, where

$p < q < p^{1 + c/\log\log x}$.

Clearly the $a$'s depend on $x$ and $a_1 \to \infty$ as $x$ tends to infinity. Denote by
$N_c(a_1, \ldots, a_k; x)$ the number of integers not exceeding $x$ which are not
divisible by any of the $a_i$'s. Further, denote by $M_c(x)$ the number of integers
$n \le x$ which do not have two prime factors $p$ and $q$ satisfying

$p < q < p^{1 + c/\log\log n}$.

We have to prove that

(15) $M_c(x) = (1 + o(1))\, e^{-c} x$.

Clearly

$M_c(x) \le N_c(a_1, a_2, \ldots, a_k; x)$,

but because of the slow increase of $\log\log n$ it is easy to see that

$M_c(x) = N_c(a_1, a_2, \ldots, a_k; x) + o(x)$.

Thus to prove Theorem 2 it will suffice to show that

(16) $N_c(a_1, a_2, \ldots, a_k; x) = x e^{-c} + o(x)$.


We obtain by a simple sieve process the well-known formula

$N_c(a_1, a_2, \ldots, a_k; x) = x \sum_{l=0}^{k} (-1)^l \Sigma_l$,

where

$\Sigma_l = \dfrac{1}{x} \sum \left[\dfrac{x}{\{a_{i_1}, a_{i_2}, \ldots, a_{i_l}\}}\right]$,

where $i_1, i_2, \ldots, i_l$ runs through all distinct $l$-tuples from 1 to $k$. (The curly
bracket in the denominator denotes least common multiple.)
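The sieve formula, and the fact that truncating the alternating sum after an odd or an even number of levels gives a lower or an upper bound, can be checked on a small example. The sketch below keeps the integer parts $[x/\{\ldots\}]$, for which the bounds hold exactly; the function names are hypothetical and the moduli 6, 10, 15 are an arbitrary illustration, not the $a_i$ of the text.

```python
from itertools import combinations
from math import gcd


def lcm_of(values):
    """Least common multiple of an iterable of positive integers (1 if empty)."""
    result = 1
    for v in values:
        result = result * v // gcd(result, v)
    return result


def count_not_divisible(x, a):
    """Brute-force count of n <= x divisible by none of the integers in a."""
    return sum(1 for n in range(1, x + 1) if all(n % ai for ai in a))


def truncated_sieve_sum(x, a, depth):
    """Inclusion-exclusion sum over subsets of a of size 0, 1, ..., depth."""
    total = 0
    for l in range(depth + 1):
        sign = -1 if l % 2 else 1
        for subset in combinations(a, l):
            total += sign * (x // lcm_of(subset))
    return total


if __name__ == "__main__":
    a, x = [6, 10, 15], 100
    print(count_not_divisible(x, a))
    print([truncated_sieve_sum(x, a, depth) for depth in range(len(a) + 1)])
```

Truncation at an odd depth never exceeds the true count and truncation at an even depth is never below it; the full sum (depth equal to the number of moduli) reproduces the count exactly.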
By a well-known combinatorial argument*

(17) $x \sum_{l=0}^{2t-1} (-1)^l \Sigma_l \le N_c(a_1, \ldots, a_k; x) \le x \sum_{l=0}^{2t} (-1)^l \Sigma_l$

for every $t > 0$. We evidently have, by a simple computation (the dashes
indicate that $p < q < p^{1 + c/\log\log x}$ and $pq \le x$)


1 (1+ o(1))e 1 —_ ; 
ae » + o(1) =¢ + o(1), 


a 


(is) = = c= 


i=1 


by the estimate for $\sum_{p < x} p^{-1}$. Further, for every fixed $l$ (the two dashes
indicate that $\{a_{i_1}, a_{i_2}, \ldots, a_{i_l}\} \le x$)
x a 
@9) a= lige] == 2 Gay tO) 


since there are only $o(x)$ $l$-tuples satisfying

$\{a_{i_1}, \ldots, a_{i_l}\} \le x$.

This last statement follows from the fact that the integers

$\{a_{i_1}, \ldots, a_{i_l}\}$

have at most $2l$ prime factors and, by a well-known theorem of Landau
(5, Vol. I, pp. 208-11), the number of integers not exceeding $x$ having $2l$
prime factors equals

$(1 + o(1)) \dfrac{x}{\log x} \dfrac{(\log\log x)^{2l-1}}{(2l-1)!} = o(x)$,

and finally a simple argument shows that the number of solutions of

$\{a_{i_1}, \ldots, a_{i_l}\} = n$

is less than a constant depending only on $l$.
Now we outline the proof of

(20) $\Sigma_l = \dfrac{c^l}{l!} + o(1)$.

For $l = 1$, (20) follows from (18). For $l > 1$ we can prove (20) by a simple
induction process, similar to but a bit more complicated than that used in the
estimations in Theorem 1. We do not give the details since they are somewhat
cumbersome.


*This is one of the basic ideas of Brun’s method, see for example, Landau Zahlentheorie, 


Vol. 1, Kap. 2. 

















From (17) and (20) we have

(21) $N_c(a_1, \ldots, a_k; x) = x \sum_{l=0}^{\infty} (-1)^l \dfrac{c^l}{l!} + o(x) = x e^{-c} + o(x)$,

which is (16).


REFERENCES 


1. P. Erdős, On the density of some sequences of integers, Bull. Amer. Math. Soc., 54 (1948), 691.
2. P. Erdős, Some remarks about additive and multiplicative functions, Bull. Amer. Math. Soc., 52 (1946), 533, theorem 9.
3. P. Erdős, On the density of the abundant numbers, J. Lond. Math. Soc., 9 (1934), 279.
4. G. H. Hardy and S. Ramanujan, The normal number of prime factors of n, Quart. J. Math., 48 (1917), 76-92.
5. E. Landau, Verteilung der Primzahlen,
6. P. Turán, On a theorem of Hardy and Ramanujan, J. Lond. Math. Soc., 9 (1934), 274-6.


University of Alberta 











ON THE REPRESENTATION OF FUNCTIONS AS 
FOURIER TRANSFORMS 


P. G. ROONEY 


If $f \in L_p(-\infty, \infty)$, $1 < p \le 2$, then $f$ has a Fourier-Plancherel transform
$F \in L_q(-\infty, \infty)$ where $p^{-1} + q^{-1} = 1$. Also if $|x|^{1-2/q} f(x) \in L_q(-\infty, \infty)$,
$q \ge 2$, then $f$ has a Fourier-Plancherel transform $F \in L_q(-\infty, \infty)$. These
results can be found in (2, Theorems 74 and 79). In neither case, however,
does the collection of transforms cover $L_q$, except when $p = q = 2$, and in
neither case, with the same exception, has the collection of transforms been
characterized.

Further, if $f \in L_p(-\infty, \infty)$, $1 < p \le 2$, then its transform $F$ has the
property $|x|^{1-2/p} F(x) \in L_p(-\infty, \infty)$ (see 2, Theorem 80) but, except when
$p = 2$, the collection of transforms does not cover the set of functions with
this property, and again, except when $p = 2$, the collection of transforms
has not been characterized.

Our object here is to find such characterizations, and this is done for the
various cases in Theorems 1, 2, and 3 below. This characterization is given
in terms of an operator

$\mathfrak{F}_{k,t}[F] = \dfrac{(-ik/t)^{k+1}}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} (x - ik/t)^{-(k+1)} F(x)\,dx, \quad k = 1, 2, \ldots.$

It transpires that this operator is an inversion operator for the Fourier transform,
and its inversion theory will be the subject of another paper.


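THEOREM 1. A necessary and sufficient condition that a function
$F \in L_q(-\infty, \infty)$, $q \ge 2$, be the Fourier transform of a function in $L_p(-\infty, \infty)$,
with $p^{-1} + q^{-1} = 1$, is that there exist a constant $M$ such that

$\int_{-\infty}^{\infty} |\mathfrak{F}_{k,t}[F]|^p\,dt \le M, \quad k = 1, 2, \ldots.$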


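Received April 23, 1958.

Proof of necessity. Suppose $F$ is the Fourier transform of $f \in L_p(-\infty, \infty)$.
Now an easy calculation shows that for $k = 1, 2, \ldots$,

$(2\pi)^{-1/2} \int_{-\infty}^{\infty} (x - ik/t)^{-(k+1)} e^{-ixy}\,dx = \begin{cases} -(2\pi)^{1/2} (-i)^{k+1} y^k e^{ky/t}/k!, & y < 0,\ t > 0, \\ (2\pi)^{1/2} (-i)^{k+1} y^k e^{ky/t}/k!, & y > 0,\ t < 0, \\ 0, & yt > 0. \end{cases}$

Hence, since for each $t \ne 0$ and each $k = 1, 2, \ldots$, $(x - ik/t)^{-(k+1)} \in L_p(-\infty, \infty)$,
we have from (2, Theorem 75) that

$\mathfrak{F}_{k,t}[F] = \dfrac{(-ik/t)^{k+1}}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} (x - ik/t)^{-(k+1)} F(x)\,dx = \begin{cases} \dfrac{(k/t)^{k+1}}{k!} \displaystyle\int_0^{\infty} e^{-ky/t} y^k f(y)\,dy, & t > 0, \\ \dfrac{(k/|t|)^{k+1}}{k!} \displaystyle\int_{-\infty}^{0} e^{-ky/t} |y|^k f(y)\,dy, & t < 0. \end{cases}$

Thus, using Hölder's inequality, we have for $t > 0$

$|\mathfrak{F}_{k,t}[F]| \le \dfrac{(k/t)^{k+1}}{k!} \left\{\int_0^{\infty} e^{-ky/t} y^k\,dy\right\}^{1/q} \left\{\int_0^{\infty} e^{-ky/t} y^k |f(y)|^p\,dy\right\}^{1/p} = \left\{\dfrac{(k/t)^{k+1}}{k!} \int_0^{\infty} e^{-ky/t} y^k |f(y)|^p\,dy\right\}^{1/p},$

and consequently,

$\int_0^{\infty} |\mathfrak{F}_{k,t}[F]|^p\,dt \le \dfrac{k^{k+1}}{k!} \int_0^{\infty} t^{-(k+1)}\,dt \int_0^{\infty} e^{-ky/t} y^k |f(y)|^p\,dy = \dfrac{k^{k+1}}{k!} \int_0^{\infty} y^k |f(y)|^p\,dy \int_0^{\infty} t^{-(k+1)} e^{-ky/t}\,dt = \int_0^{\infty} |f(y)|^p\,dy.$

A similar calculation for $t < 0$ shows that

$\int_{-\infty}^{0} |\mathfrak{F}_{k,t}[F]|^p\,dt \le \int_{-\infty}^{0} |f(y)|^p\,dy,$

and hence

$\int_{-\infty}^{\infty} |\mathfrak{F}_{k,t}[F]|^p\,dt \le \int_{-\infty}^{\infty} |f(y)|^p\,dy = M.$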
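Proof of sufficiency. For $s > 0$ let

$g_+(s) = -(2\pi)^{-1/2}\, i \int_{-\infty}^{\infty} \dfrac{1}{x - is} F(x)\,dx,$

and

$g_-(s) = (2\pi)^{-1/2}\, i \int_{-\infty}^{\infty} \dfrac{1}{x + is} F(x)\,dx,$

and denote by $L_{k,t}$ the Widder-Post inversion operator for the Laplace
transformation; that is

$L_{k,t}[g] = (-1)^k (k/t)^{k+1} g^{(k)}(k/t)/k!, \quad k = 1, 2, \ldots.$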
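The behaviour of the Widder-Post operator can be seen on a function whose derivatives are known in closed form. The following is a sketch under stated assumptions: the function names are hypothetical, the test function $g(s) = 1/(s+1)$ (the Laplace transform of $e^{-t}$) is chosen only for illustration, and the $k$-th derivative is supplied already divided by $k!$ so that no large factorial is formed. For this $g$ one finds $L_{k,t}[g] = (1 + t/k)^{-(k+1)}$, which tends to $e^{-t}$ as $k \to \infty$.

```python
import math


def widder_post(taylor_coeff, k, t):
    """Evaluate L_{k,t}[g] = (-1)^k (k/t)^(k+1) g^(k)(k/t) / k!.

    taylor_coeff(k, s) must return g^(k)(s) / k!.  Moderate k only:
    (k/t)**(k+1) overflows a float when k is very large.
    """
    s = k / t
    return (-1) ** k * s ** (k + 1) * taylor_coeff(k, s)


def coeff_one_over_s_plus_one(k, s):
    # For g(s) = 1/(s + 1): g^(k)(s) / k! = (-1)^k / (s + 1)^(k + 1).
    return (-1) ** k / (s + 1) ** (k + 1)


if __name__ == "__main__":
    for k in (5, 20, 100):
        print(k, widder_post(coeff_one_over_s_plus_one, k, 1.0), math.exp(-1.0))
```

The convergence is slow (the error is of order $1/k$), which is typical of this inversion formula.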


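Now if $s \ge \delta > 0$, and $k = 1, 2, \ldots$, then

$|(x \pm is)^{-(k+1)} F(x)| \le (x^2 + \delta^2)^{-(k+1)/2} |F(x)| \in L_1(-\infty, \infty),$

since from Hölder's inequality

$\int_{-\infty}^{\infty} (x^2 + \delta^2)^{-(k+1)/2} |F(x)|\,dx \le \left\{\int_{-\infty}^{\infty} (x^2 + \delta^2)^{-p(k+1)/2}\,dx\right\}^{1/p} \left\{\int_{-\infty}^{\infty} |F(x)|^q\,dx\right\}^{1/q} < \infty.$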














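Hence by (1, Corollary 39.2), $g_\pm(s)$ has derivatives of all orders in $0 < s < \infty$,
and these derivatives can be calculated by differentiating under the integral
sign. Thus for $t > 0$,

$L_{k,t}[g_+(s)] = \dfrac{(-ik/t)^{k+1}}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} (x - ik/t)^{-(k+1)} F(x)\,dx = \mathfrak{F}_{k,t}[F]$

and

$L_{k,t}[g_-] = \dfrac{(ik/t)^{k+1}}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} (x + ik/t)^{-(k+1)} F(x)\,dx = \mathfrak{F}_{k,-t}[F],$

so that

$\int_0^{\infty} |L_{k,t}[g_+]|^p\,dt = \int_0^{\infty} |\mathfrak{F}_{k,t}[F]|^p\,dt \le M, \quad k = 1, 2, \ldots,$

and

$\int_0^{\infty} |L_{k,t}[g_-]|^p\,dt = \int_0^{\infty} |\mathfrak{F}_{k,-t}[F]|^p\,dt \le M, \quad k = 1, 2, \ldots.$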

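Further $g_\pm(s) \to 0$ as $s \to \infty$. For from Hölder's inequality we have

$|g_\pm(s)| \le (2\pi)^{-1/2} \left\{\int_{-\infty}^{\infty} (x^2 + s^2)^{-p/2}\,dx\right\}^{1/p} \left\{\int_{-\infty}^{\infty} |F(x)|^q\,dx\right\}^{1/q} = O(s^{-1/q}).$

Hence by (3, Chapter 7, Theorem 15a) there are functions $f_+$ and $f_-$ in
$L_p(0, \infty)$ such that

$g_+(s) = \int_0^{\infty} e^{-st} f_+(t)\,dt, \quad s > 0,$

and

$g_-(s) = \int_0^{\infty} e^{-st} f_-(t)\,dt, \quad s > 0.$

Let

$f(t) = \begin{cases} f_+(t), & t > 0, \\ f_-(-t), & t < 0. \end{cases}$

Then clearly $f \in L_p(-\infty, \infty)$ and hence by (2, Theorem 74) $f$ has a Fourier
transform $F^* \in L_q(-\infty, \infty)$. We now show $F = F^*$ a.e.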
Let 
rer l ; 
g: (s) = —(2r) if e - F* (x)dx, s> 0, 
and 
“() = ey tif” - 1 F*(x)d > 0 
g- = oe ae as x)dx, 5 ‘ 


Then since for each s > 0, (x — is)“'€ L, (—@, @), and 


[ (Qr)tie™ y<0,s>0, 


(24) *(P) f “ps ay ax =j)—(2z)ie”, y>O0,s <0, 
0, sy > 0, 
















we have from (2, Theorem 75) for s > 0, 


g,(s) = —(27)' if iil — F*¥(x)dx 
~. x — 1s 
. J. e™ f(y)dy = J e fx(y) = g+(s), 
0 
and 
“(s) = (24) i | _ p*(x)dx 
g-(s) = (2x)7* 1 es x)dx 


20 @ 
J e” f(y)dy = e ” f_(y)dy = g_(s). 
x ee 


Consequently, for s > 0 


{ 4 - (F(x) — F*(x))dx = 0 
e >= & 


~ * 


and 


%« | 7 * 
- — F*(x r = (0). 
J x 7, (F @) F*(x))dx = 0 


Letting ¢(x) = F(x) — F*(x), the last two equations yield 


ss | 
i me 5 o(x)dx = 0, s # 0. 


Then denoting the even and odd parts of ¢ by ¢, and @» respectively, we 
have for s ¥ 0 


a l f° l 
: - @,(x)dx = — ——— x)dx. 
Fr. x + 1s ¢ J. x + 8S ool 
But the function on the left of this equation is an odd function of s while 
the function on the right is even. Hence each is zero, so that for s # 0 


—_—_ T eae 
—+——3 ¢ =-= —.- %,(x)dx = 0, 
J Yas o.(x)dx ot epee o.(x)dx ) 
and 
a ; ee” ” ~~ 
J eos do(x)dx = 5 | rs: do(x)dx = 0. 
Thus for each s > 0, 
gi —? 4 = 2f- I a 
J 2+ ;* o.(x’)dx = *h.2 4; o,.(x)dx = 0, 
and 
ft - go(s!)de = 2 ~— $o(x)dx = 0 
ox+s ” tails +s ete F 


and hence by the uniqueness theorem for the Stieltjes transformation (3,
chapter 8, Theorem 5a) $\varphi_e$ and $\varphi_o$ are zero almost everywhere. Thus $\varphi$ is zero
almost everywhere so that $F = F^*$ almost everywhere, and $F$ has the prescribed
representation.

For Theorems 2 and 3 let us denote by $\mathfrak{L}_q(-\infty, \infty)$ the collection of
functions $f$ such that $|x|^{1-2/q} f(x) \in L_q(-\infty, \infty)$.


THEOREM 2. A necessary and sufficient condition that a function
$F \in L_q(-\infty, \infty)$, $q \ge 2$, be the Fourier transform of a function in $\mathfrak{L}_q(-\infty, \infty)$,
$q \ge 2$, is that there exist a constant $M$ such that

$\int_{-\infty}^{\infty} |t|^{q-2} |\mathfrak{F}_{k,t}[F]|^q\,dt \le M, \quad k > q - 2.$


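Proof of necessity. Suppose $F$ is the Fourier transform of $f \in \mathfrak{L}_q(-\infty, \infty)$.
Then as in the proof of Theorem 1, for $t > 0$ and $k > q - 2$

$|\mathfrak{F}_{k,t}[F]| \le \left\{\dfrac{(k/t)^{k+1}}{k!} \int_0^{\infty} e^{-ky/t} y^k |f(y)|^q\,dy\right\}^{1/q},$

and consequently if $k > q - 2$

$\int_0^{\infty} t^{q-2} |\mathfrak{F}_{k,t}[F]|^q\,dt \le \dfrac{k^{k+1}}{k!} \int_0^{\infty} t^{q-k-3}\,dt \int_0^{\infty} e^{-ky/t} y^k |f(y)|^q\,dy = \dfrac{k^{k+1}}{k!} \int_0^{\infty} y^k |f(y)|^q\,dy \int_0^{\infty} e^{-ky/t} t^{q-k-3}\,dt = K(k) \int_0^{\infty} y^{q-2} |f(y)|^q\,dy,$

where $K(k) = k^{q-1}\, \Gamma(k - q + 2)/k!$. Similarly

$\int_{-\infty}^{0} |t|^{q-2} |\mathfrak{F}_{k,t}[F]|^q\,dt \le K(k) \int_{-\infty}^{0} |y|^{q-2} |f(y)|^q\,dy,$

so that

$\int_{-\infty}^{\infty} |t|^{q-2} |\mathfrak{F}_{k,t}[F]|^q\,dt \le K(k) \int_{-\infty}^{\infty} |y|^{q-2} |f(y)|^q\,dy.$

But from Stirling's formula,

$\lim_{k \to \infty} K(k) = 1,$

so that $K(k)$ is bounded for $k > q - 2$. Hence there is an $M$ such that

$\int_{-\infty}^{\infty} |t|^{q-2} |\mathfrak{F}_{k,t}[F]|^q\,dt \le M, \quad k > q - 2.$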
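The limit of $K(k)$ can be checked numerically. The sketch below assumes the form $K(k) = k^{q-1}\,\Gamma(k-q+2)/k!$ obtained by evaluating the inner integral in the proof, and works with logarithms of the gamma function so that large $k$ does not overflow; the function name is hypothetical.

```python
import math


def K(k, q):
    """K(k) = k**(q-1) * Gamma(k - q + 2) / k!, evaluated via log-gamma."""
    if k <= q - 2:
        raise ValueError("K(k) is only defined for k > q - 2")
    log_value = ((q - 1) * math.log(k)
                 + math.lgamma(k - q + 2)
                 - math.lgamma(k + 1))
    return math.exp(log_value)


if __name__ == "__main__":
    for k in (5, 50, 500, 5000):
        print(k, K(k, 3.0), K(k, 2.5))
```

For $q = 3$ the expression reduces to $k/(k-1)$, which makes the limit 1 visible without Stirling's formula.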
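Proof of sufficiency. Let $g_+$ and $g_-$ be defined as in the proof of Theorem 1.
Then as in that proof, for $t > 0$

$L_{k,t}[g_+] = \mathfrak{F}_{k,t}[F],$

and

$L_{k,t}[g_-] = \mathfrak{F}_{k,-t}[F],$

and hence

$\int_0^{\infty} t^{q-2} |L_{k,t}[g_+]|^q\,dt = \int_0^{\infty} t^{q-2} |\mathfrak{F}_{k,t}[F]|^q\,dt \le M, \quad k > q - 2,$

and

$\int_0^{\infty} t^{q-2} |L_{k,t}[g_-]|^q\,dt = \int_0^{\infty} t^{q-2} |\mathfrak{F}_{k,-t}[F]|^q\,dt \le M, \quad k > q - 2.$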


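Consider first $g_+$. By (3, chapter 1, Theorem 17a), with $\alpha_k(t) = t^{1-2/q} L_{k,t}[g_+]$,
there is a function $f_+$ with $t^{1-2/q} f_+(t) \in L_q(0, \infty)$, and an increasing unbounded
sequence of integers $\{k_i\}$ such that for any function $\beta(t) \in L_p(0, \infty)$,

$\lim_{i \to \infty} \int_0^{\infty} \beta(t)\, t^{1-2/q} L_{k_i,t}[g_+]\,dt = \int_0^{\infty} \beta(t)\, t^{1-2/q} f_+(t)\,dt.$

But for each $s > 0$, $t^{-(1-2/q)} e^{-st} \in L_p(0, \infty)$, and hence choosing this as our
$\beta(t)$ we have for $s > 0$

$\lim_{i \to \infty} \int_0^{\infty} e^{-st} L_{k_i,t}[g_+]\,dt = \int_0^{\infty} e^{-st} f_+(t)\,dt.$

However, for $x > 0$,

$\int_0^{x} |L_{k,t}[g_+]|\,dt \le \left\{\int_0^{x} t^{p-2}\,dt\right\}^{1/p} \left\{\int_0^{x} t^{q-2} |L_{k,t}[g_+]|^q\,dt\right\}^{1/q} \le (p-1)^{-1/p} M^{1/q} x^{1/q} = O(x)$ as $x \to \infty$,

and as in the proof of Theorem 1, $g_+(s) \to 0$ as $s \to \infty$. Hence by (3, chapter
7, Theorem 11b),

$\lim_{k \to \infty} \int_0^{\infty} e^{-st} L_{k,t}[g_+]\,dt = g_+(s), \quad s > 0,$

and thus

$g_+(s) = \int_0^{\infty} e^{-st} f_+(t)\,dt, \quad s > 0.$

Similarly $f_-$ exists with $t^{1-2/q} f_-(t) \in L_q(0, \infty)$ such that

$g_-(s) = \int_0^{\infty} e^{-st} f_-(t)\,dt, \quad s > 0.$

Let

$f(t) = \begin{cases} f_+(t), & t > 0, \\ f_-(-t), & t < 0. \end{cases}$

Then clearly $f \in \mathfrak{L}_q(-\infty, \infty)$, and hence by (2, Theorem 79) $f$ has a Fourier
transform $F^* \in L_q(-\infty, \infty)$. It remains to show $F = F^*$ a.e., which now
follows as in Theorem 1.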


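THEOREM 3. A necessary and sufficient condition that a function
$F \in \mathfrak{L}_p(-\infty, \infty)$, $1 < p \le 2$, be the Fourier transform of a function in $L_p(-\infty, \infty)$
is that there exist a constant $M$ such that

$\int_{-\infty}^{\infty} |\mathfrak{F}_{k,t}[F]|^p\,dt \le M, \quad k = 1, 2, \ldots.$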











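Proof of necessity. If $F \in \mathfrak{L}_p(-\infty, \infty)$ is the Fourier transform of $f \in L_p(-\infty, \infty)$
then by (2, Theorem 74), $F \in L_q(-\infty, \infty)$, and hence by Theorem
1, there is a constant $M$ so that

$\int_{-\infty}^{\infty} |\mathfrak{F}_{k,t}[F]|^p\,dt \le M, \quad k = 1, 2, \ldots.$

Proof of sufficiency. Let $g_+(s)$ and $g_-(s)$ be defined as in Theorem 1. Then
as in that theorem,

$\int_0^{\infty} |L_{k,t}[g_+(s)]|^p\,dt \le M, \quad k = 1, 2, \ldots,$

and

$\int_0^{\infty} |L_{k,t}[g_-(s)]|^p\,dt \le M, \quad k = 1, 2, \ldots.$

Further $g_\pm(s) \to 0$ as $s \to \infty$. For from Hölder's inequality we have for $s > 0$

$|g_\pm(s)| \le (2\pi)^{-1/2} \left\{\int_{-\infty}^{\infty} \dfrac{|x|^{q-2}}{(x^2 + s^2)^{q/2}}\,dx\right\}^{1/q} \left\{\int_{-\infty}^{\infty} |x|^{p-2} |F(x)|^p\,dx\right\}^{1/p} = O(s^{-1/q}).$

Hence by (3, chapter 7, Theorem 15a), there are functions $f_+$ and $f_-$ in
$L_p(0, \infty)$ such that

$g_+(s) = \int_0^{\infty} e^{-st} f_+(t)\,dt, \quad s > 0,$

and

$g_-(s) = \int_0^{\infty} e^{-st} f_-(t)\,dt, \quad s > 0.$

Let

$f(t) = \begin{cases} f_+(t), & t > 0, \\ f_-(-t), & t < 0. \end{cases}$

Then clearly $f \in L_p(-\infty, \infty)$ and hence by (2, Theorems 75 and 80) $f$ has
a Fourier transform $F^* \in \mathfrak{L}_p(-\infty, \infty)$. It remains to show that $F = F^*$ a.e.,
and this follows as in Theorem 1.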


REFERENCES 


1. E. J. McShane, Integration (Princeton, 1944). 
2. E. C. Titchmarsh, An introduction to the theory of Fourier integrals (2nd ed.; Oxford, 1948). 
3. D. V. Widder, The Laplace transform (Princeton, 1941). 


University of Toronto 






















THE ELLIPTIC INTEGRALS OF THE THIRD KIND 
E. H. NEVILLE 


This paper develops a case for adopting as the standard elliptic integrals
of the third kind the function $\Pi\mathrm{s}(u, a)$ defined by

$\Pi\mathrm{s}(u, a) = \int_0^u \dfrac{\mathrm{qs}\,a\ \mathrm{qs}'a\ du}{\mathrm{qs}^2 u - \mathrm{qs}^2 a}$

and the three functions $\Pi\mathrm{s}(u, a + K_c)$, $\Pi\mathrm{s}(u, a + K_n)$, $\Pi\mathrm{s}(u, a + K_d)$ where
$K_c$, $K_n$, $K_d$ are the three quarter-periods of the Jacobian system. The function
$\Pi\mathrm{s}(u, a)$ is the same function whether $\mathrm{qs}\,u$ is $\mathrm{cs}\,u$, $\mathrm{ns}\,u$, or $\mathrm{ds}\,u$.

The origin of the paper was a wish to understand how it has come about 
that the integrals commonly accepted as standard are not related sym- 
metrically to the theta functions in terms of which they are expressed. The 
explanation of this irregularity is in three parts: 

(1) The first of Jacobi's formulae for evaluating an elliptic integral is a
deduction from the identity

(0.1) $\dfrac{\Theta^2 0\ \Theta(u + a)\, \Theta(u - a)}{\Theta^2 a\ \Theta^2 u} = 1 - c\, \mathrm{sn}^2 a\, \mathrm{sn}^2 u.$

(2) To cover the range of real integrals with real variables it is necessary to
use in addition to $\Theta(u + a)\,\Theta(u - a)$ the three products

$\Theta_1(u + a)\,\Theta_1(u - a), \quad H(u + a)\,H(u - a), \quad H_1(u + a)\,H_1(u - a).$

(3) If the only elliptic functions recognized are $\mathrm{sn}\,u$, $\mathrm{cn}\,u$, $\mathrm{dn}\,u$, the only
denominator which can be associated with the products in (2) is $\Theta^2 a\ \Theta^2 u$.

The third part of this answer is the mischief-maker leading to a set of integrals
with no community of structure.


Received July 6, 1958.

1. The notation is the systematic notation used in my Jacobian Elliptic
Functions (8), including that for bipolar functions suggested in the preface
(p. iv) to the second edition (1951). Except that he prefers $\omega_p$ to $K_p$, it is adopted
by Lenz in his paper (7) written as a tribute to Faber. Glaisher's function
$\mathrm{pq}\,u$ is the function with simple zeros congruent with $K_p$ and simple poles
congruent with $K_q$ and with 1 for its leading coefficient at the origin.

The bipolar function $\mathrm{bpq}\,u$ has simple poles congruent with $K_p$ and $K_q$
and simple zeros congruent with the other two of the four points $K_s$, $K_c$,
$K_n$, $K_d$; since these other points are the zeros of the derivative $\mathrm{pq}'u$, the bipolar
function is a multiple of the logarithmic derivative $\mathrm{pq}'u/\mathrm{pq}\,u$ and we obtain
a definite function by again requiring the leading coefficient at the origin to
be 1. Then

(1.1) $\mathrm{bps}\,u = -\mathrm{ps}'u/\mathrm{ps}\,u = \mathrm{sp}'u/\mathrm{sp}\,u$

and if the origin is neither pole nor zero 

(1.2) bpq « = sp’K, pq’ u/pq u. 


Explicitly, $\mathrm{bpq}\,u = \mathrm{rp}\,u\ \mathrm{tq}\,u = \mathrm{tp}\,u\ \mathrm{rq}\,u$, but more often than not the
arbitrary coupling of a zero with a pole is an irrelevant nuisance. Since
$\mathrm{ps}\,u\ \mathrm{ps}(u + K_p)$ is independent of $u$, (1.1) implies
(1.3) $\mathrm{bps}(u + K_p) = -\mathrm{bps}\,u$.
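The statement that $\mathrm{ps}\,u\ \mathrm{ps}(u + K_p)$ does not depend on $u$ can be checked numerically in the one case where everything is real, $p = c$. The sketch below rests on several assumptions that are not in the text: $K_c$ is taken to be the real quarter-period $K$, the parameter `m` plays the role of $c$ (the squared modulus), and $\mathrm{sn}$, $\mathrm{cn}$, $\mathrm{dn}$ are computed for real $u$ and $0 \le m < 1$ by the classical arithmetic-geometric-mean (descending Landen) scheme. All function names are hypothetical. Under these assumptions the product $\mathrm{cs}\,u\ \mathrm{cs}(u + K)$ should equal $-\sqrt{1-m}$ for every $u$.

```python
import math


def agm_sequence(m):
    """Run the AGM from (1, sqrt(1 - m)); return the limit and the ratios c_n/a_n."""
    a, b = 1.0, math.sqrt(1.0 - m)
    ratios = []
    while True:
        c = 0.5 * (a - b)
        a, b = 0.5 * (a + b), math.sqrt(a * b)
        ratios.append(c / a)
        if abs(c) < 1e-15:
            return a, ratios


def quarter_period(m):
    """Real quarter-period K = pi / (2 * AGM(1, sqrt(1 - m)))."""
    a, _ = agm_sequence(m)
    return math.pi / (2.0 * a)


def sn_cn_dn(u, m):
    """Jacobian sn, cn, dn for real u and 0 <= m < 1."""
    a, ratios = agm_sequence(m)
    phi = (2 ** len(ratios)) * a * u
    for r in reversed(ratios):
        # Descending step: phi_{n-1} = (phi_n + asin((c_n/a_n) sin phi_n)) / 2.
        phi = 0.5 * (phi + math.asin(r * math.sin(phi)))
    sn, cn = math.sin(phi), math.cos(phi)
    return sn, cn, math.sqrt(1.0 - m * sn * sn)  # dn > 0 for real u


def cs(u, m):
    """Glaisher's cs u = cn u / sn u."""
    sn, cn, _ = sn_cn_dn(u, m)
    return cn / sn


if __name__ == "__main__":
    m = 0.5
    K = quarter_period(m)
    for u in (0.2, 0.7, 1.3):
        print(u, cs(u, m) * cs(u + K, m), -math.sqrt(1.0 - m))
```

The other two quarter-periods are not real, so checking the cases $p = n$ and $p = d$ would need complex arguments, which this sketch does not attempt.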

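The theta functions I use also have 1 for leading coefficient at the origin.
For $Hu/H'0$, $H_1u/H_10$, $\Theta u/\Theta 0$, $\Theta_1u/\Theta_10$ I write $\vartheta_s u$, $\vartheta_c u$, $\vartheta_n u$, $\vartheta_d u$, relieving
the memory by associating each of the functions with its lattice of zeros.

The quotient $\vartheta_p u/\vartheta_q u$ is the elliptic function $\mathrm{pq}\,u$.
The quarter-period relations between the theta functions are

(1.4, 1.5) $\vartheta_c u = A\,\vartheta_s(u + K_c), \quad \vartheta_n u = B e^{\lambda u}\,\vartheta_s(u + K_n)$
(1.6) $\vartheta_d u = C\,\vartheta_n(u + K_c) = D e^{\lambda u}\,\vartheta_s(u + K_d)$

where $A$, $B$, $C$, $D$, $\lambda$ are constants whose values are not needed in this paper.
From these relations it follows that the function $\mathrm{zp}\,u$ defined according to
Lenz's notation (7) by

(1.7) $\mathrm{zp}\,u = \vartheta_p'u/\vartheta_p u$

satisfies the quarter-period relations

(1.8) $\mathrm{zc}\,u = \mathrm{zs}(u + K_c), \quad \mathrm{zd}\,u = \mathrm{zn}(u + K_c),$
(1.9) $\mathrm{zn}\,u = \mathrm{zs}(u + K_n) + \lambda, \quad \mathrm{zd}\,u = \mathrm{zs}(u + K_d) + \lambda.$

Since $\vartheta_n u$ is a multiple of $\Theta u$, the logarithmic derivative $\mathrm{zn}\,u$ is identical with
the function $Zu$ defined by Jacobi.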


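2. In terms of the function $\vartheta_n u$, Jacobi's identity (0.1) becomes

(2.1) $\dfrac{\vartheta_n(a + u)\,\vartheta_n(a - u)}{\vartheta_n^2 a\ \vartheta_n^2 u} = 1 - c\,\mathrm{sn}^2 a\,\mathrm{sn}^2 u = \Delta_n,$

and if we alter the numerators in turn, but not the denominator, we have

(2.2) $\dfrac{\vartheta_d(a + u)\,\vartheta_d(a - u)}{\vartheta_n^2 a\ \vartheta_n^2 u} = c\,\mathrm{cn}^2 a\,\mathrm{cn}^2 u + c' = \Delta_d,$
(2.3) $\dfrac{\vartheta_s(a + u)\,\vartheta_s(a - u)}{\vartheta_n^2 a\ \vartheta_n^2 u} = \mathrm{sn}^2 a - \mathrm{sn}^2 u = \Delta_s,$
(2.4) $\dfrac{\vartheta_c(a + u)\,\vartheta_c(a - u)}{\vartheta_n^2 a\ \vartheta_n^2 u} = c^{-1}\mathrm{dn}^2 a\,\mathrm{dn}^2 u - c^{-1}c' = \Delta_c.$

It was all but inevitable that before the discovery by Glaisher in 1882 of the
complete group of twelve Jacobian functions the integrands to be associated
with Jacobi's integrand

(2.5) $I_n = -\tfrac12\, \partial \log \Delta_n/\partial a = c\,\mathrm{sn}\,a\,\mathrm{cn}\,a\,\mathrm{dn}\,a\ \mathrm{sn}^2 u/\Delta_n$

should be

(2.6) $I_d = -\tfrac12\, \partial \log \Delta_d/\partial a = c\,\mathrm{sn}\,a\,\mathrm{cn}\,a\,\mathrm{dn}\,a\ \mathrm{cn}^2 u/\Delta_d$
(2.7) $I_s = -\tfrac12\, \partial \log \Delta_s/\partial a = -\mathrm{sn}\,a\,\mathrm{cn}\,a\,\mathrm{dn}\,a/\Delta_s$
(2.8) $I_c = -\tfrac12\, \partial \log \Delta_c/\partial a = \mathrm{sn}\,a\,\mathrm{cn}\,a\,\mathrm{dn}\,a\ \mathrm{dn}^2 u/\Delta_c$

but a revision in the light of Glaisher's discovery is long overdue.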


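3. If

(3.1) $\Lambda_p = \Lambda_p(u, a) = \tfrac12 \log \dfrac{\vartheta_p(a - u)}{\vartheta_p(a + u)}$

then

(3.2) $\dfrac{\partial \log \Delta_p}{\partial a} = -2\,\dfrac{\partial \Lambda_p}{\partial u} - 2\,\mathrm{zn}\,a$

and therefore

(3.3) $\int_0^u I_p\,du = \Lambda_p(u, a) + u\,\mathrm{zn}\,a.$

This is Jacobi's argument. The relation between the integrals is clear if
we replace (3.3) by

(3.4) $\int_0^u (I_p - \mathrm{zn}\,a)\,du = \Lambda_p(u, a)$

but $\mathrm{zn}\,a$ is not an elliptic function of $a$, and we can only regard the integrals
in (3.3) as forming not one set of peculiar interest but one of the four sets
of the more general form $\Lambda_p(u, a) + u\,\mathrm{zq}\,a$.

So much was evident a century ago, and Enneper (2, §34) recorded the
integrands corresponding to the sixteen combinations. The calculation is
simple. Since $\mathrm{zq}\,a - \mathrm{zn}\,a = \mathrm{qn}'a/\mathrm{qn}\,a$

(3.5) $\Lambda_p(u, a) + u\,\mathrm{zq}\,a = \int_0^u \left(I_p + \dfrac{\mathrm{qn}'a}{\mathrm{qn}\,a}\right) du = -\tfrac12 \int_0^u \dfrac{\partial}{\partial a}\left(\log \dfrac{\Delta_p}{\mathrm{qn}^2 a}\right) du.$

For given $p$, and $q$ other than $n$, the denominator $\Delta_p$ can be put into the form
$U_{pq}\,\mathrm{qn}^2 a + V_{pq}$, where $U_{pq}$, $V_{pq}$ do not involve $a$, and then

(3.6) $\dfrac{\partial}{\partial a}\left(\log \dfrac{\Delta_p}{\mathrm{qn}^2 a}\right) = -\dfrac{2 V_{pq}\,\mathrm{qn}'a}{\Delta_p\,\mathrm{qn}\,a}.$

Hence, for $q$ other than $n$,

(3.7) $\Lambda_p(u, a) + u\,\mathrm{zq}\,a = \dfrac{\mathrm{qn}'a}{\mathrm{qn}\,a} \int_0^u \dfrac{V_{pq}\,du}{\Delta_p}$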










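and the integrands which yield the sixteen integrals are given in terms of
$\mathrm{sn}\,u$, $\mathrm{cn}\,u$, $\mathrm{dn}\,u$ compactly and explicitly in Table I.


TABLE I

DERIVATIVE OF $\tfrac12 \log \dfrac{\vartheta_p(a - u)}{\vartheta_p(a + u)} + u\,\dfrac{\vartheta_q'a}{\vartheta_q a}$ WITH RESPECT TO $u$

p \ q    s                          c                          n                                       d                          ÷
s        $-\mathrm{sn}^2 u$         $\mathrm{cn}^2 u$          $-1$                                    $c^{-1}\mathrm{dn}^2 u$    $\mathrm{sn}^2 a - \mathrm{sn}^2 u$
c        $\mathrm{cn}^2 u$          $-c'\,\mathrm{sn}^2 u$     $\mathrm{dn}^2 u$                       $-c^{-1}c'$                $c^{-1}\mathrm{dn}^2 a\,\mathrm{dn}^2 u - c^{-1}c'$
n        $1$                        $\mathrm{dn}^2 u$          $c\,\mathrm{sn}^2 u$                    $\mathrm{cn}^2 u$          $1 - c\,\mathrm{sn}^2 a\,\mathrm{sn}^2 u$
d        $\mathrm{dn}^2 u$          $c'$                       $c\,\mathrm{cn}^2 u$                    $c'\,\mathrm{sn}^2 u$      $c\,\mathrm{cn}^2 a\,\mathrm{cn}^2 u + c'$
×        $\mathrm{sn}'a/\mathrm{sn}\,a$   $\mathrm{cn}'a/\mathrm{cn}\,a$   $\mathrm{sn}\,a\,\mathrm{cn}\,a\,\mathrm{dn}\,a$   $\mathrm{dn}'a/\mathrm{dn}\,a$


As functions of $u$, the integrands in this table are multiples of the sixteen
fractions each of which has one of the four numerators 1, $\mathrm{sn}^2 u$, $\mathrm{cn}^2 u$, $\mathrm{dn}^2 u$ and
one of the four denominators $\Delta_s$, $\Delta_c$, $\Delta_n$, $\Delta_d$. In this sense the set is complete,
but the structure, so clear from the integrals, is utterly obscure when only
the integrands are displayed.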


4. In using (3.7) we have completed our table of integrands from its third
column, but since $\mathrm{zq}\,u - \mathrm{zr}\,u = \mathrm{qr}'u/\mathrm{qr}\,u$, we could as easily complete a
row from any one of its members, and we now ask if a different choice of
standard integrals and a free use of Glaisher's notation will clarify the pattern
of the integrands.

The clue is in the effect of quarter-period addition on the theta functions.
A quarter-period addition to $a$ is a quarter-period addition to the arguments
of the two theta functions in $\Lambda_p$ and to the arguments of the two theta functions
in $\mathrm{zq}\,a$, and if $p$ and $q$ are the same, only one transformation is involved.
Let us then define a set of four integrals by writing

(4.1) $\Pi\mathrm{p}(u, a) = \Lambda_p(u, a) + u\,\mathrm{zp}\,a$

and complete the set of integrals by means of the identity

(4.2) $\Lambda_p(u, a) + u\,\mathrm{zq}\,a = \Pi\mathrm{p}(u, a) + u\,\mathrm{qp}'a/\mathrm{qp}\,a.$

From (1.4) and (1.6) applied to (3.1)

$\Lambda_c(u, a) = \Lambda_s(u, a + K_c), \quad \Lambda_d(u, a) = \Lambda_n(u, a + K_c)$

and therefore from (1.8)

(4.3, 4.4) $\Pi\mathrm{c}(u, a) = \Pi\mathrm{s}(u, a + K_c), \quad \Pi\mathrm{d}(u, a) = \Pi\mathrm{n}(u, a + K_c).$

Also from (1.5)

$\dfrac{\vartheta_n(a - u)}{\vartheta_n(a + u)} = e^{-2\lambda u}\,\dfrac{\vartheta_s(a + K_n - u)}{\vartheta_s(a + K_n + u)}, \quad \dfrac{\vartheta_n'a}{\vartheta_n a} = \lambda + \dfrac{\vartheta_s'(a + K_n)}{\vartheta_s(a + K_n)},$

and therefore

$\Lambda_n(u, a) = -\lambda u + \Lambda_s(u, a + K_n), \quad \mathrm{zn}\,a = \lambda + \mathrm{zs}(a + K_n),$

implying

(4.5) $\Pi\mathrm{n}(u, a) = \Pi\mathrm{s}(u, a + K_n).$

As functions of $a$, the integrands with which we are dealing are periodic in
$2K_c$ and $2K_n$; hence

$\Pi\mathrm{s}(u, a + K_c + K_n) = \Pi\mathrm{s}(u, a + K_d),$

and from (4.4) and (4.5)

(4.6) $\Pi\mathrm{d}(u, a) = \Pi\mathrm{s}(u, a + K_d).$

Thus for $p = c, n, d$,

(4.7) $\Pi\mathrm{p}(u, a) = \Pi\mathrm{s}(u, a + K_p).$

In my book, $\Pi\mathrm{p}(u, a)$ is defined by this formula, and not directly in terms
of the theta function $\vartheta_p u$.

The structure of the set of integrals

$\Pi\mathrm{s}(u, a), \ \Pi\mathrm{c}(u, a), \ \Pi\mathrm{n}(u, a), \ \Pi\mathrm{d}(u, a)$

is symmetrical, for if $\Pi\mathrm{p}(u, a)$ is any one of the four functions, then

$\Pi\mathrm{p}(u, a), \ \Pi\mathrm{p}(u, a + K_c), \ \Pi\mathrm{p}(u, a + K_n), \ \Pi\mathrm{p}(u, a + K_d)$

are the same four functions looked at, so to speak, from $K_p$. To put the matter
differently, the symmetrical relation

(4.8) $\Pi\mathrm{q}(u, a + K_p) = \Pi\mathrm{p}(u, a + K_q)$

shows that no one of the functions dominates the set. Briot and Bouquet
(1, p. 447) complete the set from $\Pi\mathrm{n}(u, a)$ and associate each function
$\Pi\mathrm{n}(u, a + K_p)$ with one theta function and each difference $\Pi\mathrm{n}(u, a + K_p) - \Pi\mathrm{n}(u, a)$
with one elliptic function, but their notation does not achieve the
economy of typical formulae.


5. To use an integral we must be able to recognize the integrand. We denote
the integrand corresponding to $\Pi\mathrm{p}(u, a)$ by $J_p$, or if necessary by $J_p(u, a)$.
In terms of theta functions

$J_p = \partial \Lambda_p/\partial u + \mathrm{zp}\,a,$

but what we have to consider is the explicit expression of $J_p$ as an elliptic
function. The four integrands satisfy the same quarter-period relations as
the functions from which they are derived or, in other words, satisfy the
typical relation

(5.1) $J_q(u, a + K_p) = J_p(u, a + K_q)$

derived from (4.8).








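In our table in §3, the functions $J_s$, $J_c$, $J_n$, $J_d$ occupy the principal diagonal
where they appear as follows:

(5.21, 5.22) $J_s = -\dfrac{\mathrm{sn}'a\ \mathrm{sn}^2 u}{\mathrm{sn}\,a\,(\mathrm{sn}^2 a - \mathrm{sn}^2 u)}, \quad J_c = -\dfrac{c'\,\mathrm{cn}'a\ \mathrm{sn}^2 u}{\mathrm{cn}\,a\,(c^{-1}\mathrm{dn}^2 a\,\mathrm{dn}^2 u - c^{-1}c')},$

(5.23, 5.24) $J_n = \dfrac{c\,\mathrm{sn}\,a\,\mathrm{cn}\,a\,\mathrm{dn}\,a\ \mathrm{sn}^2 u}{1 - c\,\mathrm{sn}^2 a\,\mathrm{sn}^2 u}, \quad J_d = \dfrac{c'\,\mathrm{dn}'a\ \mathrm{sn}^2 u}{\mathrm{dn}\,a\,(c\,\mathrm{cn}^2 a\,\mathrm{cn}^2 u + c')}.$

We may suggest that it is because only the original Jacobian functions $\mathrm{sn}\,u$,
$\mathrm{cn}\,u$, $\mathrm{dn}\,u$ are used that the symmetry of the quartette cannot be seen, but
since each function might be expressed in terms of any one of the twelve
functions $\mathrm{pq}\,u$, we are not likely to find satisfactory transformations by a
process of trial and error.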

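We take a hint from the Weierstrassian theory, in which the fundamental
integrand of the third kind is ℘'a/(℘u - ℘a), and we have
\[ \int_0^u \frac{\wp'a\,du}{\wp u - \wp a} = \log\frac{\sigma(a - u)}{\sigma(a + u)} + 2u\,\frac{\sigma'a}{\sigma a}. \]
If the Weierstrassian functions have the same lattice as the Jacobian functions,
ϑ_s u and σu are integral functions with the same zeros, and the relation between
them is
\[ \sigma u = e^{\mu u^2}\,\vartheta_s u, \]
where μ is a constant. Hence
\[ \log\frac{\sigma(a - u)}{\sigma(a + u)} = -4\mu a u + \log\frac{\vartheta_s(a - u)}{\vartheta_s(a + u)}, \qquad \frac{\sigma'a}{\sigma a} = 2\mu a + \frac{\vartheta_s'a}{\vartheta_s a}, \]
and therefore
\[ \int_0^u \frac{\wp'a\,du}{\wp u - \wp a} = \log\frac{\vartheta_s(a - u)}{\vartheta_s(a + u)} + 2u\,\frac{\vartheta_s'a}{\vartheta_s a} = 2\{\Lambda_s(u, a) + u\,\mathrm{zs}\,a\}, \]
that is,
\[ J_s = \frac{\tfrac12\,\wp'a}{\wp u - \wp a}. \]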


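Since ℘u differs from qs²u by a constant, whether q is c, n, or d, we have
\[ J_s = \frac{\mathrm{qs}\,a\,\mathrm{qs}'a}{\mathrm{qs}^2u - \mathrm{qs}^2a} \tag{5.3} \]
and therefore
\[ J_p(u, a) = \frac{\mathrm{qs}(a + K_p)\,\mathrm{qs}'(a + K_p)}{\mathrm{qs}^2u - \mathrm{qs}^2(a + K_p)}, \tag{5.4} \]
a general formula which includes (5.3).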
To verify that the formulae (5.21-5.24) extracted from the table in §3 can be
deduced from (5.4) is an exercise in algebra. First, qs'a = - sq'a/sq²a gives
\[ J_s = -\frac{\mathrm{sq}'a\,\mathrm{sq}^2u}{\mathrm{sq}\,a\,(\mathrm{sq}^2a - \mathrm{sq}^2u)}; \tag{5.5} \]
this formula includes (5.21), and shows that in spite of appearances the
integrand given by (5.21) does not stand in any special relation to K_n.
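Next, since ps a ps(a + K_p) = ps'K_p, identification of q with p in (5.4) gives
\[ J_p = \frac{(\mathrm{ps}'K_p)^2\,\mathrm{sp}\,a\,\mathrm{sp}'a}{\mathrm{ps}^2u - (\mathrm{ps}'K_p)^2\,\mathrm{sp}^2a}, \]
that is,
\[ J_p = \frac{(\mathrm{ps}'K_p)^2\,\mathrm{sp}\,a\,\mathrm{sp}'a\,\mathrm{sp}^2u}{1 - (\mathrm{ps}'K_p)^2\,\mathrm{sp}^2a\,\mathrm{sp}^2u}; \tag{5.61} \]
this formula includes (5.23), identifying Jacobi's integrand with J_s(u, a + K_n).
In other words, Πn(u, a) is Jacobi's function Π(u, a) seen as one member of a
set of which the other two members Πc(u, a), Πd(u, a) have their integrands
given by
\[ J_c = \frac{c'\,\mathrm{sc}\,a\,\mathrm{sc}'a\,\mathrm{sc}^2u}{1 - c'\,\mathrm{sc}^2a\,\mathrm{sc}^2u}, \qquad J_d = -\frac{cc'\,\mathrm{sd}\,a\,\mathrm{sd}'a\,\mathrm{sd}^2u}{1 + cc'\,\mathrm{sd}^2a\,\mathrm{sd}^2u}. \tag{5.62, 5.63} \]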





Lastly, to recover (5.22) and (5.24) from (5.4), we suppose q to be distinct
from p and r to be the third member of the set c, n, d; then
\[ \mathrm{qs}(a + K_p) = \mathrm{qs}K_p\,\mathrm{rp}\,a, \qquad \mathrm{qr}(a + K_p) = \mathrm{qr}K_p\,\mathrm{rq}\,a. \tag{5.71, 5.72} \]
Since sr²u{qs²u - qs²(a + K_p)} is a linear function of qr²u which is zero
only if qr²u = qr²(a + K_p), it follows that qs²u - qs²(a + K_p) is a multiple
of rs²u(qr²a qr²u - qr²K_p), and is therefore the product of
\[ \mathrm{rs}^2u\,(\mathrm{pq}^2K_r\,\mathrm{qr}^2a\,\mathrm{qr}^2u + \mathrm{pr}^2K_q) \]
by a factor independent of u. Determining the factor by putting u = K_q
and using (5.71), we have
\[ \mathrm{qs}^2u - \mathrm{qs}^2(a + K_p) = \mathrm{rp}^2a\,\mathrm{rs}^2u\,(\mathrm{pq}^2K_r\,\mathrm{qr}^2a\,\mathrm{qr}^2u + \mathrm{pr}^2K_q). \]
Using (5.71) again and replacing qs²K_p rp'a/rp a by ps²K_q pr'a/pr a, we have
\[ J_p = \frac{\mathrm{ps}^2K_q\,\mathrm{pr}'a\,\mathrm{sr}^2u}{\mathrm{pr}\,a\,(\mathrm{pq}^2K_r\,\mathrm{qr}^2a\,\mathrm{qr}^2u + \mathrm{pr}^2K_q)}. \tag{5.81} \]


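This is the formula of which (5.22) and (5.24) are two cases; a third case is
another formula for Jacobi's integrand:
\[ J_n = \frac{c\,\mathrm{nc}'a\,\mathrm{sc}^2u}{\mathrm{nc}\,a\,(c'^{-1}\mathrm{dc}^2a\,\mathrm{dc}^2u - c'^{-1}c)}. \tag{5.82} \]
In fact there are six cases of (5.81), but the interchange of q and r is almost
trivial. The direct transformation of (5.82) into (5.23) takes the form
\[ \frac{c\,\mathrm{nc}'a\,\mathrm{sc}^2u}{\mathrm{nc}\,a\,(c'^{-1}\mathrm{dc}^2a\,\mathrm{dc}^2u - c'^{-1}c)} = -\frac{cc'\,\mathrm{cn}\,a\,\mathrm{cn}'a\,\mathrm{sn}^2u}{\mathrm{dn}^2a\,\mathrm{dn}^2u - c\,\mathrm{cn}^2a\,\mathrm{cn}^2u} = \frac{cc'\,\mathrm{sn}\,a\,\mathrm{sn}'a\,\mathrm{sn}^2u}{(1 - c\,\mathrm{sn}^2a)(1 - c\,\mathrm{sn}^2u) - c(1 - \mathrm{sn}^2a)(1 - \mathrm{sn}^2u)}. \]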
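The chain of equalities above can be spot-checked numerically. The following is only a sketch under stated assumptions: it takes the classical real system with parameter m = c in (0, 1), computes sn, cn, dn by the arithmetic-geometric-mean recursion (a standard method, not one used in the paper), and the helper names `ellipj` and `jacobi_integrand_forms` are hypothetical.

```python
import math

def ellipj(u, m):
    """Return (sn, cn, dn) for real u and parameter 0 <= m < 1 (AGM recursion)."""
    a, b, c = [1.0], math.sqrt(1.0 - m), [math.sqrt(m)]
    while abs(c[-1]) > 1e-15:
        a_prev = a[-1]
        a.append(0.5 * (a_prev + b))
        c.append(0.5 * (a_prev - b))
        b = math.sqrt(a_prev * b)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    for i in range(n, 0, -1):
        phi = 0.5 * (phi + math.asin(c[i] * math.sin(phi) / a[i]))
    sn = math.sin(phi)
    return sn, math.cos(phi), math.sqrt(1.0 - m * sn * sn)

def jacobi_integrand_forms(u, a, m):
    """Three expressions for J_n(u, a); m stands for c and 1 - m for c'."""
    m1 = 1.0 - m
    sa, ca, da = ellipj(a, m)
    su, cu, du = ellipj(u, m)
    # (5.82): nc'a = sn a dn a / cn^2 a
    nc_a, dnc_a = 1.0 / ca, sa * da / ca ** 2
    dc2 = (da / ca) ** 2 * (du / cu) ** 2
    form_582 = m * dnc_a * (su / cu) ** 2 / (nc_a * (dc2 - m) / m1)
    # middle member of the chain: cn'a = -sn a dn a
    form_mid = -m * m1 * ca * (-sa * da) * su ** 2 / (da ** 2 * du ** 2 - m * ca ** 2 * cu ** 2)
    # (5.23): sn'a = cn a dn a
    form_523 = m * sa * (ca * da) * su ** 2 / (1.0 - m * sa ** 2 * su ** 2)
    return form_582, form_mid, form_523

f1, f2, f3 = jacobi_integrand_forms(0.7, 0.4, 0.5)
print(f1, f2, f3)  # the three values should agree to rounding error
```

Agreement at a few sample points is of course no proof, but it is a quick guard against a misplaced sign or factor when the formulae are transcribed.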
















6. The relation between Πp(u, a) and Πq(u, a) can be expressed as a relation
between functions instead of as a relation between arguments, for (4.1) gives
\[ \Pi p(u, a) - \Pi q(u, a) = \tfrac12\log\frac{\mathrm{pq}(a - u)}{\mathrm{pq}(a + u)} + u\,\frac{\mathrm{pq}'a}{\mathrm{pq}\,a}. \tag{6.1} \]
In other words, an alternative definition of Πp(u, a) in terms of Πs(u, a) is
\[ \Pi p(u, a) = \Pi s(u, a) + \tfrac12\log\frac{\mathrm{ps}(a - u)}{\mathrm{ps}(a + u)} + u\,\frac{\mathrm{ps}'a}{\mathrm{ps}\,a}. \tag{6.2} \]
The additional logarithmic ambiguity is only apparent if it is understood that
the logarithm is zero when u = 0 and varies continuously as u describes the
path of integration implicit in Πs(u, a).

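It is interesting to establish (6.2) in terms of integrands. With differences of
notation, the algebra is essentially Legendre's (5, §46; 6, §49). With the use
of the bipolar function, the addition theorem for ps u can be written
\[ \mathrm{ps}(u + v) = \mathrm{ps}\,u\,\mathrm{ps}\,v\,(\mathrm{bps}\,u - \mathrm{bps}\,v)/(\mathrm{ps}^2u - \mathrm{ps}^2v). \]
Hence
\[ \frac{\mathrm{ps}(a - u)}{\mathrm{ps}(a + u)} = \frac{\mathrm{bps}\,u + \mathrm{bps}\,a}{\mathrm{bps}\,u - \mathrm{bps}\,a} \tag{6.3} \]
and the result to be proved is, that if a_p = a + K_p, then
\[ \frac{\mathrm{ps}\,a_p\,\mathrm{ps}'a_p}{\mathrm{ps}^2u - \mathrm{ps}^2a_p} = \frac{\mathrm{ps}\,a\,\mathrm{ps}'a}{\mathrm{ps}^2u - \mathrm{ps}^2a} - \frac{\mathrm{bps}\,a\,\mathrm{bps}'u}{\mathrm{bps}^2u - \mathrm{bps}^2a} + \frac{\mathrm{ps}'a}{\mathrm{ps}\,a}. \]
Since
\[ \mathrm{ps}'a = -\mathrm{ps}\,a\,\mathrm{bps}\,a, \qquad \mathrm{ps}'a_p = -\mathrm{ps}\,a_p\,\mathrm{bps}\,a_p = \mathrm{ps}\,a_p\,\mathrm{bps}\,a, \]
from (1.3), this is equivalent to
\[ -\frac{\mathrm{bps}'u}{\mathrm{bps}^2u - \mathrm{bps}^2a} = 1 + \frac{\mathrm{ps}^2a}{\mathrm{ps}^2u - \mathrm{ps}^2a} + \frac{\mathrm{ps}^2a_p}{\mathrm{ps}^2u - \mathrm{ps}^2a_p}. \tag{6.4} \]
Now ps²u(bps²u - bps²a) is a quadratic function of ps²u which vanishes
if ps²u = ps²a and therefore also, from (1.3), if ps²u = ps²a_p; also the
coefficient of ps⁴u in ps²u bps²u, that is, in ps'²u, is 1. Hence
\[ \mathrm{ps}^2u\,(\mathrm{bps}^2u - \mathrm{bps}^2a) = (\mathrm{ps}^2u - \mathrm{ps}^2a)(\mathrm{ps}^2u - \mathrm{ps}^2a_p). \tag{6.5} \]
Multiplying by sp²u, differentiating, and substituting for ps'u and sp'u from
(1.1), we have
\[ \mathrm{bps}\,u\,\mathrm{bps}'u = -\mathrm{bps}\,u\,(\mathrm{ps}^2u - \mathrm{ps}^2a\,\mathrm{ps}^2a_p\,\mathrm{sp}^2u), \]
that is,
\[ -\mathrm{ps}^2u\,\mathrm{bps}'u = \mathrm{ps}^4u - \mathrm{ps}^2a\,\mathrm{ps}^2a_p. \tag{6.6} \]
From (6.5) and (6.6),
\[ -\frac{\mathrm{bps}'u}{\mathrm{bps}^2u - \mathrm{bps}^2a} = \frac{\mathrm{ps}^4u - \mathrm{ps}^2a\,\mathrm{ps}^2a_p}{(\mathrm{ps}^2u - \mathrm{ps}^2a)(\mathrm{ps}^2u - \mathrm{ps}^2a_p)}, \tag{6.7} \]
and the right-hand side of (6.7), resolved into partial fractions in the variable
ps²u, is the right-hand side of (6.4).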
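The identities (6.5) and (6.6) can be checked numerically in one concrete case. The sketch below assumes the classical real system with p = c, so that ps = cs, bps u = ns u ds u / cs u = dn u/(sn u cn u), and cs²(a + K_c) = c' sc²a; the helper names are hypothetical and the derivative bps'u is approximated by a central difference rather than taken from (1.1).

```python
import math

def ellipj(u, m):
    """Return (sn, cn, dn) for real u and parameter 0 <= m < 1 (AGM recursion)."""
    a, b, c = [1.0], math.sqrt(1.0 - m), [math.sqrt(m)]
    while abs(c[-1]) > 1e-15:
        a_prev = a[-1]
        a.append(0.5 * (a_prev + b))
        c.append(0.5 * (a_prev - b))
        b = math.sqrt(a_prev * b)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    for i in range(n, 0, -1):
        phi = 0.5 * (phi + math.asin(c[i] * math.sin(phi) / a[i]))
    sn = math.sin(phi)
    return sn, math.cos(phi), math.sqrt(1.0 - m * sn * sn)

def residuals_65_66(u, a, m):
    """Residuals of (6.5) and (6.6) for p = c; both should be close to zero."""
    def cs2(x):
        s, c, _ = ellipj(x, m)
        return (c / s) ** 2
    def bcs(x):
        s, c, d = ellipj(x, m)
        return d / (s * c)            # ns x ds x / cs x
    X, A = cs2(u), cs2(a)
    A_p = (1.0 - m) / A               # cs^2(a + K_c) = c' sc^2 a (assumed classical case)
    r65 = X * (bcs(u) ** 2 - bcs(a) ** 2) - (X - A) * (X - A_p)
    h = 1e-5                          # step for the central difference
    dbcs = (bcs(u + h) - bcs(u - h)) / (2.0 * h)
    r66 = -X * dbcs - (X * X - A * A_p)
    return r65, r66

print(residuals_65_66(0.7, 0.4, 0.5))
```

The residual of (6.6) is limited by the finite-difference step, so it is expected to be small but not at rounding level.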


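7. Since Πs(u, a) is an odd function of a, (6.2) can be written
\[ \Pi s(u, a) + \Pi s(u, K_p - a) = \tfrac12\log\frac{\mathrm{ps}(a + u)}{\mathrm{ps}(a - u)} - u\,\frac{\mathrm{ps}'a}{\mathrm{ps}\,a}; \tag{7.11} \]
further, since Πs(u, a) as a function of a, has 2K_q for a period,
\[ \Pi s(u, K_p - (a + K_q)) = \Pi s(u, (K_p - a) + K_q), \]
and substituting a + K_q for a in (7.11) we have for q ≠ p,
\[ \Pi q(u, a) + \Pi q(u, K_p - a) = \tfrac12\log\frac{\mathrm{rq}(a + u)}{\mathrm{rq}(a - u)} - u\,\frac{\mathrm{rq}'a}{\mathrm{rq}\,a}. \tag{7.12} \]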

The formulae (7.11) and (7.12) may be regarded as halving the area of values
of a throughout which Πq(u, a) requires a theta function for its evaluation.
From these formulae we see also that if 2a is a quarter-period the integrals
of the third kind degenerate. Since the value of ps(½K_p + u) ps(½K_p - u)
is ps²½K_p, we have from (7.11)
\[ \Pi s(u, \tfrac12K_p) = \tfrac12\log\{\mathrm{sp}\,\tfrac12K_p\,\mathrm{ps}(u + \tfrac12K_p)\} + \tfrac12u\,\mathrm{bps}\,\tfrac12K_p. \tag{7.21} \]
Also rq(½K_p + u) rq(½K_p - u) is a constant, since addition of K_p to u
interchanges the poles and the zeros of rq u; this constant is rq²½K_p, and we have
from (7.12)
\[ \Pi q(u, \tfrac12K_p) = \tfrac12\log\{\mathrm{qr}\,\tfrac12K_p\,\mathrm{rq}(u + \tfrac12K_p)\} - \tfrac12u\,(\mathrm{bqs}\,\tfrac12K_p - \mathrm{brs}\,\tfrac12K_p), \tag{7.22} \]
since rq a = rs a/qs a.
For the sake of completeness we must add that the identities
\[ \Pi p(u, K_p - a) = -\Pi s(u, a), \qquad \Pi p(u, a) = -\Pi s(u, K_p - a) \]
imply
\[ \Pi p(u, a) + \Pi p(u, K_p - a) = \tfrac12\log\frac{\mathrm{sp}(a + u)}{\mathrm{sp}(a - u)} - u\,\frac{\mathrm{sp}'a}{\mathrm{sp}\,a}, \tag{7.31} \]
\[ \Pi p(u, \tfrac12K_p) = \tfrac12\log\{\mathrm{ps}\,\tfrac12K_p\,\mathrm{sp}(u + \tfrac12K_p)\} - \tfrac12u\,\mathrm{bps}\,\tfrac12K_p. \tag{7.32} \]
To us, (7.31) and (7.32) are little more than repetitions of (7.11) and (7.21),
but we must remember that since the function we are denoting by Πn(u, a)
was known long before Πs(u, a) was introduced, the classical formulae
implicit in Jacobi's theorema de additione argumenti parametri (4, p. 159) are
cases of (7.22) and (7.32).

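The values of the bipolar functions used in (7.21) and (7.22) are easily
found. For any value of u,
\[ \mathrm{qs}\,2u + \mathrm{rs}\,2u = \mathrm{bps}\,u, \tag{7.41} \]
and therefore
\[ \mathrm{bps}\,\tfrac12K_p = \mathrm{qs}K_p + \mathrm{rs}K_p, \qquad \mathrm{bqs}\,\tfrac12K_p = \mathrm{rs}K_p. \tag{7.42, 7.43} \]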
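As a hedged illustration of (7.41) and (7.42), the sketch below checks two classical real instances: (7.41) with p = n, where it reads cs 2u + ds 2u = cn u dn u/sn u, and (7.42) with p = c, where K_c = K and the right-hand side is ns K + ds K = 1 + k'. The helper names and the AGM-based evaluation are assumptions of this example, not part of the paper.

```python
import math

def agm(x, y):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(x - y) > 1e-15 * x:
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return x

def ellipj(u, m):
    """Return (sn, cn, dn) for real u and parameter 0 <= m < 1 (AGM recursion)."""
    a, b, c = [1.0], math.sqrt(1.0 - m), [math.sqrt(m)]
    while abs(c[-1]) > 1e-15:
        a_prev = a[-1]
        a.append(0.5 * (a_prev + b))
        c.append(0.5 * (a_prev - b))
        b = math.sqrt(a_prev * b)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    for i in range(n, 0, -1):
        phi = 0.5 * (phi + math.asin(c[i] * math.sin(phi) / a[i]))
    sn = math.sin(phi)
    return sn, math.cos(phi), math.sqrt(1.0 - m * sn * sn)

def residual_741(u, m):
    """cs 2u + ds 2u - bns u, with bns u = cs u ds u / ns u."""
    s, c, d = ellipj(u, m)
    s2, c2, d2 = ellipj(2.0 * u, m)
    return (c2 + d2) / s2 - c * d / s

def residual_742(m):
    """bcs(K/2) - (ns K + ds K), with bcs u = ns u ds u / cs u."""
    K = math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - m)))
    s, c, d = ellipj(0.5 * K, m)
    return d / (s * c) - (1.0 + math.sqrt(1.0 - m))

print(residual_741(0.45, 0.3), residual_742(0.3))
```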













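Thus (7.22) becomes
\[ \Pi q(u, \tfrac12K_p) = \tfrac12\log\{\mathrm{qr}\,\tfrac12K_p\,\mathrm{rq}(u + \tfrac12K_p)\} + \tfrac12u\,(\mathrm{qs}K_p - \mathrm{rs}K_p). \tag{7.44} \]
We can modify the logarithmic terms in (7.21) and (7.22) and take fuller
advantage of (7.41) and (7.42). From (6.3), (7.11) is equivalent to
\[ \Pi s(u, a) + \Pi s(u, K_p - a) = \tfrac12\log\frac{\mathrm{bps}\,u - \mathrm{bps}\,a}{\mathrm{bps}\,u + \mathrm{bps}\,a} + u\,\mathrm{bps}\,a, \tag{7.51} \]
and therefore (7.21) is equivalent to
\[ \Pi s(u, \tfrac12K_p) = \tfrac14\log\frac{\mathrm{bps}\,u - \mathrm{qs}K_p - \mathrm{rs}K_p}{\mathrm{bps}\,u + \mathrm{qs}K_p + \mathrm{rs}K_p} + \tfrac12u\,(\mathrm{qs}K_p + \mathrm{rs}K_p). \tag{7.52} \]
Instead of (7.12) we have
\[ \Pi q(u, a) + \Pi q(u, K_p - a) = \tfrac12\log\frac{(\mathrm{bqs}\,u + \mathrm{bqs}\,a)(\mathrm{brs}\,u - \mathrm{brs}\,a)}{(\mathrm{bqs}\,u - \mathrm{bqs}\,a)(\mathrm{brs}\,u + \mathrm{brs}\,a)} - u\,(\mathrm{bqs}\,a - \mathrm{brs}\,a), \tag{7.53} \]
leading to
\[ \Pi q(u, \tfrac12K_p) = \tfrac14\log\frac{(\mathrm{bqs}\,u + \mathrm{rs}K_p)(\mathrm{brs}\,u - \mathrm{qs}K_p)}{(\mathrm{bqs}\,u - \mathrm{rs}K_p)(\mathrm{brs}\,u + \mathrm{qs}K_p)} + \tfrac12u\,(\mathrm{qs}K_p - \mathrm{rs}K_p). \tag{7.54} \]
The squares of the constants qsK_p are given by
\[ \mathrm{ns}^2K_c = -\mathrm{cs}^2K_n = 1, \qquad \mathrm{ns}^2K_d = -\mathrm{ds}^2K_n = c, \qquad \mathrm{ds}^2K_c = -\mathrm{cs}^2K_d = c', \tag{7.61} \]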


and depend only on the Jacobian system, but the constants themselves with
the exception of nsK_c depend on the choice of a basis for the lattice. Defining
v, k, k' by
\[ v = \mathrm{sc}K_n, \qquad k = \mathrm{ns}(K_c + K_n), \qquad k' = \mathrm{ds}K_c, \tag{7.62} \]
we have
\[ v^2 = -1, \qquad k^2 = c, \qquad k'^2 = c', \tag{7.63} \]
and the six critical constants are given by
\[ \mathrm{ns}K_c = 1, \quad \mathrm{cs}K_n = -v, \quad \mathrm{ns}K_d = -k, \quad \mathrm{ds}K_n = -vk, \quad \mathrm{ds}K_c = k', \quad \mathrm{cs}K_d = vk'. \tag{7.64} \]
The relations
\[ \mathrm{ns}K_c/\mathrm{cs}K_n = \mathrm{ds}K_n/\mathrm{ns}K_d = \mathrm{cs}K_d/\mathrm{ds}K_c = v \]
express that rotation in the direction K_c → K_n → K_d is positive or negative
according as v is +i or -i.

The results of expressing the constants in (7.52) and (7.54) in terms of
v, k, k' are valid for all Jacobian systems, but it is for the classical systems
in which k and k' are real that they are specially required.







8. In proposing that the typical integrand in the table in §3 should be treated 
as J,(u,a) + qp’a/qpa rather than as I,(u,a) + qn’a/qna, we are not 
altering the composition of the table. The integrands are the same sixteen 
functions of u and a, and the most to be claimed is that with the whole set of 
Glaisher’s functions at our service we have shown that we can move easily 
from one entry to another within the table. To Hermite (3) are due examples 
of a process by which the tale of recorded integrals of the third kind can be 
quadrupled in length. The denominator A, in (2.1) is the denominator in the 
classical expression for sn(a + u), and since 


0,(a + u)/),(a + u) = sn(a + u) = (snacnudnu +cnadnasnu)/A, 
we have 


qi) Eee) 





=snacnudnu+cnadnasnu. 


Jacobi’s argument now gives 





(8.2) f2 adnacnudnu — sna(dn’a + ccn’a)sn u o 
0 


snacnudnu+cnadnasnu 


a(a+u) _o, dla 


- 5G — =) “d,a° 


This method gives integrands corresponding to the 48 integrals 


} log 2262 — *) =) + ». 
0,(a + u) v,a 
with p # r, but Hermite himself attached no importance to the extension. 
His comment, “‘au fond, ces diverses expressions se raménent a la quantité .. .” 
II(u, a), suggests only that he was dissatisfied with the incoherent mass of 
formulae derived from Jacobi’s integrand and its three companions. 
More interesting than this extension is Hermite’s use of the integrand 


snacnadna/(sn*u — sna), 


which is the integrand denoted above (§§2-3) by J,, in preference to Jacobi’s 
integrand J,, or, in other words, his use of the integral A,(u,a) + uzna in 
preference to Jacobi’s integral II(u,a) which is A,(u,a) + uzna. “Cette 
intégrale présente,”’ he says, “‘plus de facilité que celle de Jacobi pour établir 
les théorémes sur |’addition des arguments” (3, p. 841). That is to say, he 
has found that the advantages of using the function Λ_s(u, a) associated with
the origin instead of the corresponding function Λ_n(u, a) associated with the
point K_n outweigh any disadvantages due to the heterogeneity of Λ_s(u, a)
+ u zn a as compared with Λ_n(u, a) + u zn a. And this in spite of the
fact that for elliptic functions he has only those whose poles are congruent
with K_n.













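9. The integrands tabulated in §3 are functions to which Jacobi's method
of integration is seen in advance to be applicable; we have still to consider the
arbitrary integrand 1/(pq²u - μ). Determining a constant a by the condition
\[ \mathrm{pq}^2a = \mu, \tag{9.11} \]
and inserting a numerator found to be convenient, we deal with the integral
\[ \int_0^u \frac{\mathrm{pq}\,a\,\mathrm{pq}'a\,du}{\mathrm{pq}^2u - \mathrm{pq}^2a}. \]
If q is s, the integral is already known, for (5.3) is equivalent to
\[ \int_0^u \frac{\mathrm{ps}\,a\,\mathrm{ps}'a\,du}{\mathrm{ps}^2u - \mathrm{ps}^2a} = \Pi s(u, a). \tag{9.12} \]
If q is not s, then pq²u is a linear function of sq²u; whether pq²u is sq²u or
1 - qs²K_p sq²u,
\[ \frac{\mathrm{pq}\,a\,\mathrm{pq}'a}{\mathrm{pq}^2u - \mathrm{pq}^2a} = \frac{\mathrm{sq}\,a\,\mathrm{sq}'a}{\mathrm{sq}^2u - \mathrm{sq}^2a}, \]
and since
\[ \frac{\mathrm{qs}\,a\,\mathrm{qs}'a}{\mathrm{qs}^2u - \mathrm{qs}^2a} = \frac{\mathrm{qs}'a}{\mathrm{qs}\,a}\cdot\frac{\mathrm{qs}^2a}{\mathrm{qs}^2u - \mathrm{qs}^2a} = \frac{\mathrm{sq}'a}{\mathrm{sq}\,a}\cdot\frac{\mathrm{sq}^2u}{\mathrm{sq}^2u - \mathrm{sq}^2a}, \]
we have
\[ \frac{\mathrm{pq}\,a\,\mathrm{pq}'a}{\mathrm{sq}^2a}\int_0^u \frac{\mathrm{sq}^2u\,du}{\mathrm{pq}^2u - \mathrm{pq}^2a} = \Pi s(u, a), \tag{9.13} \]
\[ \int_0^u \frac{\mathrm{pq}\,a\,\mathrm{pq}'a\,du}{\mathrm{pq}^2u - \mathrm{pq}^2a} = \Pi s(u, a) + u\,\frac{\mathrm{qs}'a}{\mathrm{qs}\,a}. \tag{9.14} \]
Although (9.13) is valid whether or not p is s, it is worth while to separate
the two cases for the sake of further simplification. If p is s, the formula is
\[ \frac{\mathrm{sq}'a}{\mathrm{sq}\,a}\int_0^u \frac{\mathrm{sq}^2u\,du}{\mathrm{sq}^2u - \mathrm{sq}^2a} = \Pi s(u, a), \tag{9.15} \]
a simple variant of (9.12), and if p is not s it can be written
\[ \mathrm{ps}^2K_q\,\frac{\mathrm{sq}'a}{\mathrm{sq}\,a}\int_0^u \frac{\mathrm{sq}^2u\,du}{\mathrm{pq}^2u - \mathrm{pq}^2a} = \Pi s(u, a), \tag{9.16} \]
since
\[ \mathrm{qs}^2K_p = -\mathrm{ps}^2K_q. \]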
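The earliest of all integrals of the third kind, Legendre's function Π defined
by (5, p. 17; 6, p. 17)
\[ \Pi = \int_0^{\phi} \frac{d\phi}{(1 + n\sin^2\phi)\,\Delta}, \]
where Δ = √(1 - c sin²φ), is the integral
\[ \int_0^u \frac{du}{1 + n\,\mathrm{sn}^2u}. \]
It is usual now to change the sign in the denominator, and we take the integral
of this form with sn u replaced by pq u as
\[ \int_0^u \frac{du}{1 - \lambda\,\mathrm{pq}^2u}. \]
If we define a by
\[ \mathrm{qp}^2a = \lambda, \tag{9.21} \]
we have
\[ \frac{\mathrm{qp}'a/\mathrm{qp}\,a}{1 - \lambda\,\mathrm{pq}^2u} = \frac{\mathrm{pq}\,a\,\mathrm{pq}'a}{\mathrm{pq}^2u - \mathrm{pq}^2a}, \]
and we have merely to rewrite (9.12), (9.14), (9.15), and (9.16) as
\[ \frac{\mathrm{sp}'a}{\mathrm{sp}\,a}\int_0^u \frac{du}{1 - \mathrm{sp}^2a\,\mathrm{ps}^2u} = \Pi s(u, a), \tag{9.22} \]
\[ \frac{\mathrm{qp}'a}{\mathrm{qp}\,a}\int_0^u \frac{du}{1 - \mathrm{qp}^2a\,\mathrm{pq}^2u} = u\,\frac{\mathrm{qs}'a}{\mathrm{qs}\,a} + \Pi s(u, a), \tag{9.23} \]
\[ \int_0^u \frac{\mathrm{qs}\,a\,\mathrm{qs}'a\,\mathrm{sq}^2u\,du}{1 - \mathrm{qs}^2a\,\mathrm{sq}^2u} = \Pi s(u, a), \tag{9.24} \]
\[ \frac{\mathrm{ps}^2K_q\,\mathrm{ps}'a}{\mathrm{ps}\,a}\int_0^u \frac{\mathrm{sq}^2u\,du}{1 - \mathrm{qp}^2a\,\mathrm{pq}^2u} = \Pi s(u, a). \tag{9.25} \]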
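The passage from Legendre's integral in the amplitude φ to the integral in u rests on the substitution sin φ = sn u. A minimal numerical sketch of that equivalence follows, assuming a real parameter m = c in (0, 1); Simpson's rule and the AGM evaluation of sn are choices made for this example, and all function names are hypothetical.

```python
import math

def ellipj(u, m):
    """Return (sn, cn, dn) for real u and parameter 0 <= m < 1 (AGM recursion)."""
    a, b, c = [1.0], math.sqrt(1.0 - m), [math.sqrt(m)]
    while abs(c[-1]) > 1e-15:
        a_prev = a[-1]
        a.append(0.5 * (a_prev + b))
        c.append(0.5 * (a_prev - b))
        b = math.sqrt(a_prev * b)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    for i in range(n, 0, -1):
        phi = 0.5 * (phi + math.asin(c[i] * math.sin(phi) / a[i]))
    sn = math.sin(phi)
    return sn, math.cos(phi), math.sqrt(1.0 - m * sn * sn)

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def legendre_form(phi, n_char, m):
    """Legendre's integral in the amplitude phi."""
    return simpson(lambda t: 1.0 / ((1.0 + n_char * math.sin(t) ** 2)
                                    * math.sqrt(1.0 - m * math.sin(t) ** 2)), 0.0, phi)

def jacobi_form(phi, n_char, m):
    """The same integral in the variable u, where u = F(phi | m)."""
    u = simpson(lambda t: 1.0 / math.sqrt(1.0 - m * math.sin(t) ** 2), 0.0, phi)
    return simpson(lambda v: 1.0 / (1.0 + n_char * ellipj(v, m)[0] ** 2), 0.0, u)

print(legendre_form(1.0, 0.3, 0.5), jacobi_form(1.0, 0.3, 0.5))
```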


There is an alternative substitution. The function pq u has one of the
quarter-periods of the Jacobian system for a half-period, and if this
quarter-period is K_t, the product qp u qp(u + K_t) is independent of u, that is, is a
constant of the system. If the square of this constant is j_pq, to write
\[ \lambda = j_{pq}\,\mathrm{pq}^2a \tag{9.31} \]
is equivalent to writing
\[ \mathrm{qp}^2(a + K_t) = \lambda, \]
and this change replaces Πs(u, a) by Πt(u, a). The quarter-period relevant
for ps u and sp wu is K,, and if the three quarter-periods of the system are 
K,, K,, K,, then sp(K, + K,) = —spK, and 


Jos = sp’K, sp’K,; = jug = Qs°A, qs*K,. 
The quarter-period relevant for pq u is K,, and 
jne = Op’ K,. 


From (9.22), (9.24), and (9.25) we have 






















(9.32) ms J 7 == say = Up(u,a), 

(9.33) Jpn ti estes eee — Tg(u, 0), 
lly : 

(9.34) id J : — pe. = Iir(u, a), 

and from (9.23) 

(9.35) ate |’; _ — sata 7 i + Iir(u, a). 


We can now see the structure of the integrands which compose the leading
diagonal of the table in §3. Since the only functions to be used are Jacobi's
three functions sn u, cn u, dn u, the denominator has one of the two forms
pn²u - pn²a, 1 - j_pn pn²a pn²u. The integrand J_s corresponding to Πs(u, a)
is the integrand in (9.15) with n for q; to use (9.16) would be merely to
substitute -(cn²u - cn²a) or -(dn²u - dn²a)/c for sn²u - sn²a. The function
Πn(u, a) comes only from (9.33), and since j_sn = ns²K_c ns²(K_c + K_n) = c,
we find the integrand J_n as c sn a sn'a sn²u/(1 - c sn²a sn²u), precisely as
given by Jacobi. The functions Πc(u, a) and Πd(u, a) come from (9.34), the
one when pq u is dn u and the other when pq u is cn u, but we must express
qr'a/qr a as -rq'a/rq a; the constants required are given by
\[ \mathrm{ds}^2K_n = -c, \quad j_{dn} = 1/c'; \qquad \mathrm{cs}^2K_n = -1, \quad j_{cn} = -c/c' \]
and the entries in the table can be verified immediately.


10. As we have said, the substitution μ = pq²a does not impose any
restrictions on μ, and theoretically the two formulae (9.12) and (9.14), together
with the expression of Πs(u, a) as Λ_s(u, a) + u zs a, reduce any function of
the third kind to a combination of functions each of which is a function of a
single argument. But if the problem is the evaluation of a real integral by
means of real variables, there are complications. A real value of μ does not
necessarily give a real value of a, and if u is real and a complex, then functions
of a + u are functions which must be dissected before they can be evaluated.

In discussing evaluation, we assume that K_c has a real value K and K_n an
imaginary value iK', and we assume also that K and K' are positive; then
k and k' are positive, and v is i. The origin and the points K, K + iK', iK' are
the corners of a rectangle which we denote by SCDN. In applying general
formulae it is important to remember that K + iK' is -K_d, since in the
formal theory the three quarter-periods satisfy the symmetrical relation
K_c + K_n + K_d = 0.

The path of integration is a segment of the real axis. For the present we
continue to take u = 0 for the lower limit; the effect of removing this
restriction is considered in our concluding paragraph.










If one of the twelve functions pq²a is real, all of them are real, and therefore
each of the functions pq a and each of the derivatives pq'a is either real or
imaginary. Hence in all that follows each of the functions Πp(u, a) is either
real or imaginary. To put in a real form a formula in which Πp(u, a) is in fact
imaginary, we write
\[ \Pi p(u, a) = i\,\Pi'p(u, a); \]
if one of the two functions Πp(u, a), Π'p(u, a) is imaginary, the other is real.
This notation is extremely convenient for our purpose here, but is obviously
not susceptible of extension for general use.


11. The three functions cs²u, ds²u, ns²u are real on the perimeter SCDNS
and decrease steadily from +∞ to -∞ as u describes the contour; cs²u changes
sign at C, ds²u at D, and ns²u at N. Hence ps a ps'a, which is -cs a ds a ns a,
is real if a is on SC or DN, imaginary if a is on CD or NS. It follows that
Πs(u, a) is real if a is on SC or DN, and Π's(u, a) is real if a is on CD or NS.
We identify the side to which a belongs by reference to the value of one of
the functions pq²a; most simply ds²a decreases from +∞ through c' to 0
along SCD and from 0 through -c to -∞ along DNS.

To locate a on a side of the fundamental rectangle by means of a real
variable, we write a = K_t ± b or a = K_t ± ib', where K_t is one of the two
corners available, and we have four pairs of formulae:

ds²a > c'
(11.11)  a = b:             Πs(u, a) = Πs(u, b) = Λ_s(u, b) + u zs b,
(11.12)  a = K - b:         Πs(u, a) = -Πc(u, b) = -Λ_c(u, b) - u zc b;
c' > ds²a > 0
(11.13)  a = K + ib':       iΠ's(u, a) = iΠ'c(u, ib') = Λ_c(u, ib') + u zc ib',
(11.14)  a = K + iK' - ib': iΠ's(u, a) = -iΠ'd(u, ib') = -Λ_d(u, ib') - u zd ib';
0 > ds²a > -c
(11.15)  a = K + iK' - b:   Πs(u, a) = -Πd(u, b) = -Λ_d(u, b) - u zd b,
(11.16)  a = iK' + b:       Πs(u, a) = Πn(u, b) = Λ_n(u, b) + u zn b;
-c > ds²a
(11.17)  a = iK' - ib':     iΠ's(u, a) = -iΠ'n(u, ib') = -Λ_n(u, ib') - u zn ib',
(11.18)  a = ib':           iΠ's(u, a) = iΠ's(u, ib') = Λ_s(u, ib') + u zs ib'.

For any one value of a there is a choice between two formulae, and we can
cover the whole perimeter either using two theta functions with b, b' in the
intervals (0, K), (0, K') or using the four theta functions with b, b' in the
intervals (0, ½K), (0, ½K'); in the first case we have a further choice, for we
can use ϑ_s u on CSN and ϑ_d u on CDN or ϑ_c u on SCD and ϑ_n u on SND.













With a on SC or ND the choice between functions is more apparent than
real. Writers from Legendre onwards ignore (11.12) and (11.15) without
explaining why these alternatives can be ignored. For the final evaluation
from (11.11) and (11.12) we have explicitly
\[ \Lambda_s(u, b) = \tfrac12\log\frac{\vartheta_s(b - u)}{\vartheta_s(b + u)}, \qquad \mathrm{zs}\,b = \frac{\vartheta_s'b}{\vartheta_s b}, \]
\[ \Lambda_c(u, b) = \tfrac12\log\frac{\vartheta_c(b - u)}{\vartheta_c(b + u)}, \qquad \mathrm{zc}\,b = \frac{\vartheta_c'b}{\vartheta_c b}. \]
Since ϑ_s(K - u) = ϑ_s K ϑ_c u, tables of ϑ_s u and ϑ_s'u have only to be provided
with the complementary argument K - u to become tables of ϑ_s K ϑ_c u and
-ϑ_s K ϑ_c'u, and we use the same entries and do the same arithmetic whether
we compute Λ_s(u, K - b) and zs(K - b) as
\[ \tfrac12\log\frac{\vartheta_s(K - b - u)}{\vartheta_s(K - b + u)}, \qquad \frac{\vartheta_s'(K - b)}{\vartheta_s(K - b)} \]
or compute -Λ_c(u, b) and -zc b as
\[ \tfrac12\log\frac{\vartheta_sK\,\vartheta_c(b + u)}{\vartheta_sK\,\vartheta_c(b - u)}, \qquad -\frac{\vartheta_sK\,\vartheta_c'b}{\vartheta_sK\,\vartheta_c b}. \]
The same considerations apply to (11.15) and (11.16): tables of ϑ_n u and ϑ_n'u
provided with the complementary argument K - u are tables of ϑ_n K ϑ_d u
and -ϑ_n K ϑ_d'u.
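For a real parameter b the evaluation Πn(u, b) = Λ_n(u, b) + u zn b is an ordinary theta-function computation. The sketch below assumes the classical system, takes ϑ_n proportional to the q-series 1 - 2q cos 2v + 2q⁴ cos 4v - ..., and compares the result with a direct Simpson quadrature of Jacobi's integrand (5.23); the function names, the AGM helpers and the truncation at twelve terms are assumptions of this example.

```python
import math

def agm(x, y):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(x - y) > 1e-15 * x:
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return x

def ellipj(u, m):
    """Return (sn, cn, dn) for real u and parameter 0 <= m < 1 (AGM recursion)."""
    a, b, c = [1.0], math.sqrt(1.0 - m), [math.sqrt(m)]
    while abs(c[-1]) > 1e-15:
        a_prev = a[-1]
        a.append(0.5 * (a_prev + b))
        c.append(0.5 * (a_prev - b))
        b = math.sqrt(a_prev * b)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    for i in range(n, 0, -1):
        phi = 0.5 * (phi + math.asin(c[i] * math.sin(phi) / a[i]))
    sn = math.sin(phi)
    return sn, math.cos(phi), math.sqrt(1.0 - m * sn * sn)

def pi_n_theta(u, b, m, terms=12):
    """Lambda_n(u, b) + u zn b from the q-series of theta_n (requires 0 < m < 1)."""
    K = math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - m)))
    Kp = math.pi / (2.0 * agm(1.0, math.sqrt(m)))
    q = math.exp(-math.pi * Kp / K)
    s = math.pi / (2.0 * K)                      # v = s * u
    def th(v):
        return 1.0 + 2.0 * sum((-1) ** n * q ** (n * n) * math.cos(2 * n * v)
                               for n in range(1, terms))
    def dth(v):                                  # derivative with respect to v
        return -4.0 * sum((-1) ** n * n * q ** (n * n) * math.sin(2 * n * v)
                          for n in range(1, terms))
    zn_b = s * dth(s * b) / th(s * b)            # derivative with respect to u
    return 0.5 * math.log(th(s * (b - u)) / th(s * (b + u))) + u * zn_b

def pi_n_quadrature(u, b, m, n=400):
    """Simpson quadrature of c sn b sn'b sn^2 t / (1 - c sn^2 b sn^2 t) on (0, u)."""
    sb, cb, db = ellipj(b, m)
    def f(t):
        st = ellipj(t, m)[0]
        return m * sb * cb * db * st * st / (1.0 - m * sb * sb * st * st)
    h = u / n
    total = f(0.0) + f(u) + sum((4.0 if i % 2 else 2.0) * f(i * h) for i in range(1, n))
    return total * h / 3.0

print(pi_n_theta(0.7, 0.4, 0.5), pi_n_quadrature(0.7, 0.4, 0.5))
```

The theta route needs only two function values and one logarithmic derivative, which is the economy the text attributes to tabulated theta functions.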

With a on SN or CD the process of evaluation is more elaborate and the
distinction between the alternatives is not trivial. The theta function in
Λ_p(u, ib') has the complex argument ib' + u and must be dissected before
Π'p(u, ib') can be computed. We take the four functions in turn. The theta
functions are defined in terms of v, where v/½π = u/K, that is, where v = πu/2K,
and we write also β = πb'/2K. It is to be noticed that ϑ_p'ib' means (dϑ_p/du)_{u=ib'},
that is, (dϑ_p/dv)_{v=iβ} · dv/du, and that therefore
\[ u\,\vartheta_p'ib' = v\,(d\vartheta_p/dv)_{v=i\beta}. \]
The functions are defined in terms of v and q, where
\[ q = e^{-\pi K'/K}, \tag{11.21} \]
but q is a constant of the Jacobian system and variation of q is not
contemplated.
The functions ϑ_s u, ϑ_c u are multiples of
\[ \sin v - q^{1\cdot2}\sin 3v + q^{2\cdot3}\sin 5v - q^{3\cdot4}\sin 7v + \ldots, \]
\[ \cos v + q^{1\cdot2}\cos 3v + q^{2\cdot3}\cos 5v + q^{3\cdot4}\cos 7v + \ldots, \]
and therefore ϑ_s(ib' + u) is a multiple of
\[ (\cosh\beta\sin v - q^{1\cdot2}\cosh 3\beta\sin 3v + q^{2\cdot3}\cosh 5\beta\sin 5v - \ldots) + i\,(\sinh\beta\cos v - q^{1\cdot2}\sinh 3\beta\cos 3v + q^{2\cdot3}\sinh 5\beta\cos 5v - \ldots) \]
and ϑ_c(ib' + u) is a multiple of
\[ (\cosh\beta\cos v + q^{1\cdot2}\cosh 3\beta\cos 3v + q^{2\cdot3}\cosh 5\beta\cos 5v + \ldots) - i\,(\sinh\beta\sin v + q^{1\cdot2}\sinh 3\beta\sin 3v + q^{2\cdot3}\sinh 5\beta\sin 5v + \ldots). \]
Hence
\[ \Pi's(u, ib') = \arctan\frac{\cosh\beta\sin v - q^{1\cdot2}\cosh 3\beta\sin 3v + q^{2\cdot3}\cosh 5\beta\sin 5v - \ldots}{\sinh\beta\cos v - q^{1\cdot2}\sinh 3\beta\cos 3v + q^{2\cdot3}\sinh 5\beta\cos 5v - \ldots} - v\,\frac{\cosh\beta - 3q^{1\cdot2}\cosh 3\beta + 5q^{2\cdot3}\cosh 5\beta - \ldots}{\sinh\beta - q^{1\cdot2}\sinh 3\beta + q^{2\cdot3}\sinh 5\beta - \ldots} \tag{11.22} \]
and
\[ \Pi'c(u, ib') = \arctan\frac{\sinh\beta\sin v + q^{1\cdot2}\sinh 3\beta\sin 3v + q^{2\cdot3}\sinh 5\beta\sin 5v + \ldots}{\cosh\beta\cos v + q^{1\cdot2}\cosh 3\beta\cos 3v + q^{2\cdot3}\cosh 5\beta\cos 5v + \ldots} - v\,\frac{\sinh\beta + 3q^{1\cdot2}\sinh 3\beta + 5q^{2\cdot3}\sinh 5\beta + \ldots}{\cosh\beta + q^{1\cdot2}\cosh 3\beta + q^{2\cdot3}\cosh 5\beta + \ldots}. \tag{11.23} \]
Similarly, since ϑ_n u, ϑ_d u are multiples of
\[ 1 - 2q\cos 2v + 2q^4\cos 4v - 2q^9\cos 6v + 2q^{16}\cos 8v - \ldots, \]
\[ 1 + 2q\cos 2v + 2q^4\cos 4v + 2q^9\cos 6v + 2q^{16}\cos 8v + \ldots, \]
we have
\[ \Pi'n(u, ib') = -\arctan\frac{2q\sinh 2\beta\sin 2v - 2q^4\sinh 4\beta\sin 4v + 2q^9\sinh 6\beta\sin 6v - \ldots}{1 - 2q\cosh 2\beta\cos 2v + 2q^4\cosh 4\beta\cos 4v - 2q^9\cosh 6\beta\cos 6v + \ldots} + v\,\frac{4q\sinh 2\beta - 8q^4\sinh 4\beta + 12q^9\sinh 6\beta - \ldots}{1 - 2q\cosh 2\beta + 2q^4\cosh 4\beta - 2q^9\cosh 6\beta + \ldots}, \tag{11.24} \]
\[ \Pi'd(u, ib') = \arctan\frac{2q\sinh 2\beta\sin 2v + 2q^4\sinh 4\beta\sin 4v + 2q^9\sinh 6\beta\sin 6v + \ldots}{1 + 2q\cosh 2\beta\cos 2v + 2q^4\cosh 4\beta\cos 4v + 2q^9\cosh 6\beta\cos 6v + \ldots} - v\,\frac{4q\sinh 2\beta + 8q^4\sinh 4\beta + 12q^9\sinh 6\beta + \ldots}{1 + 2q\cosh 2\beta + 2q^4\cosh 4\beta + 2q^9\cosh 6\beta + \ldots}. \tag{11.25} \]
If b' and u are real, the functions Π'p(u, ib') have real values and (11.22)-(11.25)
are formulae from which these values can be calculated. The hyperbolic
functions do not retard appreciably the convergence of the several
series; if b' is in the range (0, K'), both sinh nβ and cosh nβ are smaller than
q^{-n/2}, and if b' is in (0, ½K'), then sinh 2nβ and cosh 2nβ are smaller than q^{-n/2}.
The restriction on the path of u implies that the inverse tangents are all in
the interval (-½π, ½π).
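To illustrate how a dissected series is used in practice, the sketch below evaluates Π'n(u, ib') from the q-series of ϑ_n and compares it with a direct quadrature. It assumes the classical system, with Jacobi's imaginary transformation sn(ib'|c) = i sc(b'|c') and sn'(ib'|c) = nc(b'|c') dc(b'|c') used to write the integrand in real form, and it takes the arctangent term with a negative sign and the v-term with a positive sign. The helper names and the truncation are hypothetical choices for this example.

```python
import math

def agm(x, y):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(x - y) > 1e-15 * x:
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return x

def ellipj(u, m):
    """Return (sn, cn, dn) for real u and parameter 0 <= m < 1 (AGM recursion)."""
    a, b, c = [1.0], math.sqrt(1.0 - m), [math.sqrt(m)]
    while abs(c[-1]) > 1e-15:
        a_prev = a[-1]
        a.append(0.5 * (a_prev + b))
        c.append(0.5 * (a_prev - b))
        b = math.sqrt(a_prev * b)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    for i in range(n, 0, -1):
        phi = 0.5 * (phi + math.asin(c[i] * math.sin(phi) / a[i]))
    sn = math.sin(phi)
    return sn, math.cos(phi), math.sqrt(1.0 - m * sn * sn)

def pi_prime_n_series(u, bp, m, terms=12):
    """Dissected series for Pi'n(u, ib'), with 0 < m < 1 and 0 < b' < K'."""
    K = math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - m)))
    Kp = math.pi / (2.0 * agm(1.0, math.sqrt(m)))
    q = math.exp(-math.pi * Kp / K)
    v, beta = math.pi * u / (2.0 * K), math.pi * bp / (2.0 * K)
    E, F, G, H = 1.0, 0.0, 0.0, 1.0
    for n in range(1, terms):
        w, sg = q ** (n * n), (-1) ** n
        ch, sh = math.cosh(2 * n * beta), math.sinh(2 * n * beta)
        E += 2.0 * sg * w * ch * math.cos(2 * n * v)   # real part of theta_n(ib' + u)
        F -= 2.0 * sg * w * sh * math.sin(2 * n * v)   # imaginary part
        G -= 4.0 * sg * n * w * sh                     # (1/i) d(theta_n)/dv at v = i beta
        H += 2.0 * sg * w * ch                         # theta_n at v = i beta
    return -math.atan(F / E) + v * G / H

def pi_prime_n_quadrature(u, bp, m, n=400):
    """Simpson quadrature of the real form of the integrand of Pi'n(u, ib')."""
    s1, c1, d1 = ellipj(bp, 1.0 - m)                   # functions of b' with parameter c'
    sc, nc, dc = s1 / c1, 1.0 / c1, d1 / c1
    def f(t):
        st = ellipj(t, m)[0]
        return m * sc * nc * dc * st * st / (1.0 + m * sc * sc * st * st)
    h = u / n
    total = f(0.0) + f(u) + sum((4.0 if i % 2 else 2.0) * f(i * h) for i in range(1, n))
    return total * h / 3.0

print(pi_prime_n_series(0.7, 0.4, 0.5), pi_prime_n_quadrature(0.7, 0.4, 0.5))
```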
















The dissection of the theta functions for the evaluation of elliptic integrals 
is classical; the improvement on current practice lies in avoiding a mixture 
of functions in any one formula. 


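12. Light is thrown on the alternatives in (11.11)-(11.18) by the relation
(6.2) between Πs(u, a) and Πp(u, a):
\[ \Pi p(u, a) = \Pi s(u, a) + \tfrac12\log\frac{\mathrm{ps}(a - u)}{\mathrm{ps}(a + u)} + u\,\frac{\mathrm{ps}'a}{\mathrm{ps}\,a}. \tag{12.1} \]
Denote the midpoints of SC, CD, DN, NS by E, F, G, H, and let b = b_s be a
point in SE and ib' = b'_s be a point in SH.

In the half-sides EC, DG, GN there are points b_c, b_d, b_n at distance b from
the corners C, D, N, and we have
\[ b_c = K_c - b_s, \qquad \Pi s(u, b_c) = -\Pi c(u, b_s), \]
\[ b_d = -K_d - b_s, \qquad \Pi s(u, b_d) = -\Pi d(u, b_s), \]
\[ b_n = K_n + b_s, \qquad \Pi s(u, b_n) = \Pi n(u, b_s). \]
If b_s traverses SE, the four points b_s, b_c, b_n, b_d together traverse the two sides
SC, ND, and the evaluation of Πs(u, a) is extended from SE to the two sides
by means of the elliptic functions ps u:
\[ \Pi s(u, b_s) = \Pi s(u, b), \tag{12.21} \]
\[ \Pi s(u, b_c) = -\Pi s(u, b) - \tfrac12\log\frac{\mathrm{cs}(b - u)}{\mathrm{cs}(b + u)} - u\,\frac{\mathrm{cs}'b}{\mathrm{cs}\,b}, \tag{12.22} \]
\[ \Pi s(u, b_n) = \Pi s(u, b) + \tfrac12\log\frac{\mathrm{ns}(b - u)}{\mathrm{ns}(b + u)} + u\,\frac{\mathrm{ns}'b}{\mathrm{ns}\,b}, \tag{12.23} \]
\[ \Pi s(u, b_d) = -\Pi s(u, b) - \tfrac12\log\frac{\mathrm{ds}(b - u)}{\mathrm{ds}(b + u)} - u\,\frac{\mathrm{ds}'b}{\mathrm{ds}\,b}. \tag{12.24} \]
Since the operation of evaluating the difference
\[ \tfrac12\log\frac{\mathrm{ps}(b - u)}{\mathrm{ps}(b + u)} + u\,\frac{\mathrm{ps}'b}{\mathrm{ps}\,b} \]
from tables of ps u and ps'u is precisely the same as the operation of evaluating
Πp(u, a) in the form
\[ \tfrac12\log\frac{\vartheta_p(a - u)}{\vartheta_p(a + u)} + u\,\frac{\vartheta_p'a}{\vartheta_p a} \]
from tables of ϑ_p u and ϑ_p'u, no practical advantage is to be expected from
these formulae.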

It is different when we deal with the half-sides HN, CF, FD. On them we
have points b'_n, b'_c, b'_d such that
\[ b'_n = K_n - b'_s, \qquad \Pi s(u, b'_n) = -\Pi n(u, b'_s), \]
\[ b'_c = K_c + b'_s, \qquad \Pi s(u, b'_c) = \Pi c(u, b'_s), \]
\[ b'_d = -K_d - b'_s, \qquad \Pi s(u, b'_d) = -\Pi d(u, b'_s). \]
Since b'_s is imaginary, we take (6.2) in the form
\[ i\,\Pi'p(u, a) = i\,\Pi's(u, a) + \tfrac12\log\frac{\mathrm{bps}\,u + \mathrm{bps}\,a}{\mathrm{bps}\,u - \mathrm{bps}\,a} - u\,\mathrm{bps}\,a. \tag{12.3} \]
Using Jacobi's imaginary transformation we have
\[ \mathrm{bcs}(ib'|c) = -i\,\mathrm{bns}(b'|c'), \qquad \mathrm{bds}(ib'|c) = -i\,\mathrm{bds}(b'|c'), \qquad \mathrm{bns}(ib'|c) = -i\,\mathrm{bcs}(b'|c') \]
and therefore
\[ \Pi's(u, b'_s) = \Pi's(u, ib'), \tag{12.41} \]
\[ \Pi's(u, b'_n) = -\Pi's(u, ib') + \arctan\frac{\mathrm{bcs}(b'|c')}{\mathrm{bns}(u|c)} - u\,\mathrm{bcs}(b'|c'), \tag{12.42} \]
\[ \Pi's(u, b'_c) = \Pi's(u, ib') - \arctan\frac{\mathrm{bns}(b'|c')}{\mathrm{bcs}(u|c)} + u\,\mathrm{bns}(b'|c'), \tag{12.43} \]
\[ \Pi's(u, b'_d) = -\Pi's(u, ib') + \arctan\frac{\mathrm{bds}(b'|c')}{\mathrm{bds}(u|c)} - u\,\mathrm{bds}(b'|c'). \tag{12.44} \]
It is far quicker to evaluate a difference
\[ \arctan\frac{\mathrm{bqs}(b'|c')}{\mathrm{bps}(u|c)} - u\,\mathrm{bqs}(b'|c') \]
than to find an isolated value of a function Π'p(u, ib') by means of a
dissected q-series, and (12.41)-(12.44), unlike (12.21)-(12.24), can be
recommended to computers.
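The saving claimed for (12.43) can be seen in a small numerical sketch: the difference Π'c(u, ib') - Π's(u, ib') is computed once from the two dissected series (11.22) and (11.23), and once from the closed expression -arc tan{bns(b'|c')/bcs(u|c)} + u bns(b'|c'). The code assumes the classical system with u in (0, K) and b' in (0, K'), uses bns x = cn x dn x/sn x and bcs x = dn x/(sn x cn x), and all helper names are hypothetical.

```python
import math

def agm(x, y):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(x - y) > 1e-15 * x:
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return x

def ellipj(u, m):
    """Return (sn, cn, dn) for real u and parameter 0 <= m < 1 (AGM recursion)."""
    a, b, c = [1.0], math.sqrt(1.0 - m), [math.sqrt(m)]
    while abs(c[-1]) > 1e-15:
        a_prev = a[-1]
        a.append(0.5 * (a_prev + b))
        c.append(0.5 * (a_prev - b))
        b = math.sqrt(a_prev * b)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    for i in range(n, 0, -1):
        phi = 0.5 * (phi + math.asin(c[i] * math.sin(phi) / a[i]))
    sn = math.sin(phi)
    return sn, math.cos(phi), math.sqrt(1.0 - m * sn * sn)

def dissected_s_and_c(u, bp, m, terms=12):
    """Pi's(u, ib') and Pi'c(u, ib') from the series (11.22) and (11.23)."""
    K = math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - m)))
    Kp = math.pi / (2.0 * agm(1.0, math.sqrt(m)))
    q = math.exp(-math.pi * Kp / K)
    v, beta = math.pi * u / (2.0 * K), math.pi * bp / (2.0 * K)
    A = B = C = D = Ns = Ds = Nc = Dc = 0.0
    for n in range(terms):
        k, w, sg = 2 * n + 1, q ** (n * (n + 1)), (-1) ** n
        ch, sh = math.cosh(k * beta), math.sinh(k * beta)
        A += sg * w * ch * math.sin(k * v)
        B += sg * w * sh * math.cos(k * v)
        Ns += sg * k * w * ch
        Ds += sg * w * sh
        C += w * ch * math.cos(k * v)
        D += w * sh * math.sin(k * v)
        Nc += k * w * sh
        Dc += w * ch
    return math.atan(A / B) - v * Ns / Ds, math.atan(D / C) - v * Nc / Dc

def closed_difference(u, bp, m):
    """-arctan(bns(b'|c') / bcs(u|c)) + u bns(b'|c'), as in (12.43)."""
    s1, c1, d1 = ellipj(bp, 1.0 - m)
    y = c1 * d1 / s1                  # bns(b'|c')
    s, c, d = ellipj(u, m)
    x = d / (s * c)                   # bcs(u|c)
    return -math.atan(y / x) + u * y

pi_s, pi_c = dissected_s_and_c(0.7, 0.4, 0.5)
print(pi_c - pi_s, closed_difference(0.7, 0.4, 0.5))
```

The closed expression needs three elliptic-function values and one arctangent, against two full q-series for the isolated value.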


13. To conclude, we have to consider the integral 


= du 
wm Seer 

“1 upg u 
between arbitrary real limits. If the integral can be expressed as the differ- 
ence between integrals from 0, the evaluation in one of the forms 

~ IIi2, (u2 — %1) + ~~ IIi2, — + = IIi2, 

where Π_12 = Πs(u_2, a) - Πs(u_1, a) introduces no fresh problems. But since
the integral has a logarithmic singularity at any point where pq²u = pq²a,
there is a tacit assumption throughout that there is no such point on the
u-path.

If a is not real, this assumption does not come into operation. But if a is
real, Πs(u, a) is defined as a real integral only for values of u in (-a, a) and
L is expressible by means of Π_12 only if u_1 and u_2 are in this interval, whereas
the condition implicit in the existence of the integral does not restrict u_1 and
u_2 separately. The problem is the same as in the integration of 1/x. If neither
ϑ_s(a - u) nor ϑ_s(a + u) is zero for any value of u in (u_1, u_2), the two quotients
ϑ_s(a - u_2)/ϑ_s(a - u_1) and ϑ_s(a + u_1)/ϑ_s(a + u_2) are positive and Π_12 defined as
\[ \int_{u_1}^{u_2} J_s(u, a)\,du \]
can be computed as
\[ \tfrac12\log\frac{\vartheta_s(a - u_2)\,\vartheta_s(a + u_1)}{\vartheta_s(a - u_1)\,\vartheta_s(a + u_2)} + (u_2 - u_1)\,\frac{\vartheta_s'a}{\vartheta_s a}. \]
If there are points b_1, b_2, ..., b_m in (u_1, u_2) such that μ pq²b_i = 1, the
substitution of
\[ \tfrac12\log\left|\frac{\vartheta_s(a - u_2)\,\vartheta_s(a + u_1)}{\vartheta_s(a - u_1)\,\vartheta_s(a + u_2)}\right| + (u_2 - u_1)\,\frac{\vartheta_s'a}{\vartheta_s a} \]
for Π_12 in the formal evaluation gives the limit of the sum
\[ \left(\int_{u_1}^{b_1 - \epsilon_1} + \int_{b_1 + \epsilon_1}^{b_2 - \epsilon_2} + \ldots + \int_{b_m + \epsilon_m}^{u_2}\right)\frac{du}{1 - \mu\,\mathrm{pq}^2u} \]
when ε_1, ε_2, ..., ε_m tend independently to zero.


REFERENCES 





1. C. A. A. Briot et J. C. Bouquet, Théorie des fonctions elliptiques (Paris, 1875).

2. A. Enneper, Elliptische Funktionen, Theorie und Geschichte (Halle, 1876).

3. C. Hermite, Note sur la théorie des fonctions elliptiques, in Serret, Cours de calcul différentiel
et intégral (4th ed., Paris, 1894), 737-904.

4. C. G. J. Jacobi, Fundamenta nova theoriae functionum ellipticarum (Königsberg, 1829).

5. A. M. Legendre, Exercices de calcul intégral (Paris, 1811).

6. A. M. Legendre, Théorie des fonctions elliptiques, t. 1 (Paris, 1825).

7. H. Lenz, Über die elliptischen Funktionen von Jacobi, Math. Zeit., 67 (1957), 153-175.

8. E. H. Neville, Jacobian Elliptic Functions (2nd ed., Cambridge, 1951).


Sonning-on-Thames, 
England 













MIXED PROBLEMS FOR HYPERBOLIC EQUATIONS 
OF GENERAL ORDER 


G. F. D. DUFF 


The object of this paper is the extension to linear partial differential equations 
of order m in N independent variables, of the existence theorems for mixed 
initial and boundary value problems which have been established for systems 
of first order equations in (3). In such mixed problems an initial surface S 
and a boundary surface T are the carriers of the two types of data, and the
number of datum functions to be assigned on T depends on the configuration 
of the characteristic surfaces relative to S and T. 

For the first part of the paper (§§ 1-5) the coefficients in the differential 
equation, the initial and boundary surfaces, and the data prescribed are all 
taken to be real analytic in the variables x^1, ..., x^N. In this "analytic" case
an existence theorem is established for boundary conditions of considerable
generality. We assume that the differential equation is regularly hyperbolic
with respect to S and T, a notion which is stated precisely in § 1, and is
weaker than the usual regular hyperbolic condition. Then the single equation
of higher order is reduced to a system of equations of first order, of the type 
treated in (3), and the existence theorem there established is taken over to 
obtain the result, which is stated as Theorem 1 in § 5 below. For this purpose 
we require a certain algebraic lemma relating to the characteristic roots. 

The non-analytic problem for regularly hyperbolic equations is treated in 
§§ 6-10, by adaptation of the energy integral method. A general sufficient 
condition for the existence of a solution is given in § 6. As it appears that this 
condition is not always fulfilled, it is necessary to discuss particular cases. 
In § 8 and § 9 are treated two such special problems, each of which is a 
generalization of the known results for second order equations. The first of 
these concerns the problem wherein the number of boundary conditions is 
one less than the number of initial conditions. The second requires an assump- 
tion of symmetry relative to the boundary surface, and the number of boundary 
conditions is half the number of initial conditions. 


1. The differential equation. We consider an analytic linear partial
differential equation of order m in the N independent variables x^i:
\[ Lu = \sum_{h=0}^{m} a_{(h)\,i_1 \ldots i_h}\,\frac{\partial^h u}{\partial x^{i_1} \cdots \partial x^{i_h}} = 0. \tag{1.1} \]
The dependent variable is u = u(x^i). The coefficients
\[ a_{(h)\,i_1 \ldots i_h} \]
of order h are assumed in this first part to be convergent real power series
of the real variables x^i.

Received May 3, 1958.

Let S : φ(x^i) = 0 be an "initial" surface not characteristic for the linear
operator L; and let T : ψ(x^i) = 0 be a "boundary" surface likewise not
characteristic. We assume that both S and T are analytic and that they have
an (N - 2)-dimensional intersection C.





The characteristic surfaces G : χ(x^i) = 0 of the operator L satisfy the
equation
\[ C[\chi] = a_{(m)\,i_1 \ldots i_m}\,\frac{\partial\chi}{\partial x^{i_1}} \cdots \frac{\partial\chi}{\partial x^{i_m}} = 0, \tag{1.2} \]
and in general there will be m (or fewer) characteristic surfaces which pass
through the edge C. As seen below we assume that there are actually m. We
suppose that at least k_0 (k_0 < m) of these lie in a fixed "quadrant" R defined
by S and T: and we select k_0 of these surfaces G_i (i = 1, ..., k_0). These
shall be referred to as "select" characteristic surfaces, and all others as
"non-select."

The mixed problem to be studied below is now formulated as follows.
Define t = φ(x^i), x = ψ(x^i), and assign on S Cauchy data for u with respect
to the operator L: that is, values of u and its derivatives with respect to t
up to order m - 1 inclusive. Assign on T any k_0 of the m quantities:
\[ u, \quad \frac{\partial u}{\partial x}, \quad \ldots, \quad \frac{\partial^{m-1}u}{\partial x^{m-1}}, \]
subject to compatibility conditions of order m - 1 on the edge C. We seek
a piecewise analytic solution in R of Lu = 0 which takes the given values
on S and on T, and is analytic except across the select characteristic surfaces,
where it is of class C^{m-1}.

In order to treat this problem we shall need to assume that the operator
$L$ is regularly hyperbolic with respect to $S$ and $T$, in the following sense:
there shall be $m$ distinct characteristic surfaces passing through $C$. Another form
of this condition is available if we consider the coefficients

$a_{(m)\,i_1 \ldots i_m}$

of highest order derivatives in $Lu$, in which the indices $i_1, \ldots, i_m$ take the values
$N$ and $N - 1$, corresponding to $t$ and $x$, respectively. To find this condition we note
that by the theory of first order partial differential equations, the characteristic
surfaces through $C$ are composed of characteristic strips of the reduced
characteristic equation

(1.3)  $C[p_i, -1] = a_{(m)\,i_1 \ldots i_m}\, p_{i_1} \cdots p_{i_m} = 0$,

where $p_N = p_t = -1$ has been substituted so that $\chi$ appears in the form
$\chi = \chi_0(x^\rho, x) - t$. On the initial “curve” $C$ the initial values for the strip
elements are found from (1.3) and the conditions

$dt = \sum_{i=1}^{N-1} p_i\, dx^i$.





MIXED PROBLEMS OF GENERAL ORDER 197 


Since for $\rho = 1, 2, \ldots, N - 2$ the $dx^\rho$ are independent, we have $p_\rho = 0$
($\rho = 1, 2, \ldots, N - 2$), and only $p_x$ is different from zero. Thus

(1.4)  $C[0, 0, \ldots, 0, p_x, -1] = 0$

determines the $m$ values of $p_x$, which we shall suppose are real and distinct
on $C$ and in a neighbourhood of $C$. Setting

$a_k = a_{(m)\,N-1, N-1, \ldots, N-1, N, N, \ldots, N}$,

where the index $N - 1$ appears $k$ times, we see that (1.4) becomes

(1.5)  $a_m p_x^{m} - a_{m-1} p_x^{m-1} + \ldots + (-1)^m a_0 = 0$.

That is, the roots $p_x$ of (1.5) must be distinct and real.
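The requirement that (1.5) has $m$ real, distinct roots can be tested mechanically once the coefficients $a_k$ are known at a point of $C$. The following Python sketch is only an illustration and is not part of the original argument. It counts distinct real roots with a Sturm sequence in exact rational arithmetic. The function names, the convention that `a[k]` holds $a_k$, and the sample coefficients (chosen to mimic $u_{tt} - u_{xx} = 0$ and $u_{tt} + u_{xx} = 0$) are assumptions made for the example.

```python
from fractions import Fraction


def poly_rem(num, den):
    """Remainder of num / den; coefficient lists, highest power first."""
    num = num[:]
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)  # the leading coefficient is now exactly zero
    while num and num[0] == 0:
        num.pop(0)
    return num


def sign_changes(values):
    """Number of sign changes in a sequence, ignoring zeros."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_distinct_real_roots(coeffs):
    """Sturm's theorem on the whole real line (highest power first)."""
    p = [Fraction(c) for c in coeffs]
    while p and p[0] == 0:
        p.pop(0)
    n = len(p) - 1
    if n < 1:
        return 0
    dp = [c * (n - i) for i, c in enumerate(p[:-1])]  # derivative of p
    chain = [p, dp]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            break
        chain.append([-c for c in r])
    # Signs at +infinity and -infinity are read off the leading terms.
    at_plus = [q[0] for q in chain]
    at_minus = [q[0] * (-1) ** (len(q) - 1) for q in chain]
    return sign_changes(at_minus) - sign_changes(at_plus)


def has_m_distinct_real_roots(a):
    """Hypothetical helper: a[k] is the coefficient a_k appearing in (1.5)."""
    m = len(a) - 1
    # (1.5) reads: sum over k of (-1)^(m-k) * a_k * p^k = 0.
    coeffs = [(-1) ** (m - k) * a[k] for k in range(m, -1, -1)]
    return count_distinct_real_roots(coeffs) == m


wave_like = [1, 0, -1]     # assumed: a_0 = 1, a_1 = 0, a_2 = -1
laplace_like = [1, 0, 1]   # assumed: a_0 = 1, a_1 = 0, a_2 = 1
print(has_m_distinct_real_roots(wave_like))
print(has_m_distinct_real_roots(laplace_like))
```

Exact fractions are used so that nearly coincident roots are not misjudged through rounding. With floating-point coefficients a tolerance would have to be chosen instead.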

We note that if Lu = 0 is regularly hyperbolic in the sense of Leray (8) 
and if S is spacelike, then Lu is regularly hyperbolic with respect to S and 
to every non-characteristic surface 7. If in the regularly hyperbolic case we 
imagine the edge C to rotate about a fixed point in S, the characteristic 
surfaces issuing from C remain separated: no two touch. For our purpose it 
is enough if these surfaces are distinct for the one position of the edge C. 
Thus our condition is weaker than the customary regular hyperbolic con- 
dition. In fact it becomes equivalent for the case of two independent variables, 
when the edge C reduces to a point. 

Analogously, the normal surface of a regularly hyperbolic operator consists 
essentially of a nest of concentric ovals, such that any line through the origin 
meets the surface in a maximal number of real points. In our case it is sufficient 
if a particular line through the origin, namely the normal to C in S, meets the 
surface in a maximal number of real points. This could be realized, for instance, 
by a surface with multiple points, or by a surface consisting of ovals external 
to one another. 

If we alter the negative signs in (1.5) and consider the equation

(1.6)  $a_m \gamma^{m} + a_{m-1} \gamma^{m-1} + \ldots + a_0 = 0$,

we see that the roots $\gamma_1, \ldots, \gamma_m$ of (1.6) are also distinct. Let us define the
first order operator

(1.7)  $D_k u = \dfrac{\partial u}{\partial x} + \gamma_k \dfrac{\partial u}{\partial t}$,

which indicates differentiation along the section of a characteristic surface
by a plane $x^\rho = \text{const}$ ($\rho = 1, \ldots, N - 2$).

The operators $D_k$ shall also be termed select or non-select according as the
characteristic surface $G_k$ and the characteristic root $\gamma_k$ are select or not. We
note the identity

(1.8)  $a_m \prod_{k=1}^{m} D_k u = \sum_{k=0}^{m} a_k\, \dfrac{\partial^m u}{\partial x^{k}\, \partial t^{m-k}} + \ldots$,

where the terms omitted are derivatives of order less than $m$. In consequence,
we can write the given differential equation (1.1) in the form











198 G. F. D. DUFF 


(1.9)  $\prod_{k=1}^{m} D_k u = L_1(u)$,

where $L_1(u)$ is a linear operator of order $m$ in which no term has more than
$m - 1$ differentiations with respect to $x$ and $t$ combined.


2. Reduction to a system of first order equations. By introducing
as new dependent variables $v_i$ suitable combinations of the derivatives of $u$,
we shall perform a formal reduction of (1.1) to a system

(2.1)  $D_a v_i = \sum c_{ij}^{\rho}\, \dfrac{\partial v_j}{\partial x^\rho} + \sum e_{ij}\, v_j$,

where each $D_a$ operator will occur several times, in general, and where only
derivatives with respect to $x^\rho$ ($\rho = 1, \ldots, N - 2$) appear as $\partial/\partial x^\rho$ on the
right. This system is of the type studied in (3), as the elementary divisors
of $A^{N}$ relative to $A^{N-1}$ are simple. The dependent variables in (2.1) are also
divided into select and non-select classes, a variable $v$ being select if the
operator $D_a$ which operates on it in (2.1) is select, and vice versa. The assigned
boundary conditions will be transformed into

(2.2)  $v_i = \sum a_{ij}\, v_j + f_i$,

where on the left shall appear only the select $v_i$. Thus the existence theorem
of (3) is applicable and will lead to a solution of the original problem, when
that problem is analytic.

The formal reduction and labelling of new variables will follow the pattern
of the Cauchy-Kowalewski reduction to normal form, except for those derivatives
with respect to $x$. To handle these we have introduced the $D_k$ operators
and will employ them in the fashion of (1). A result of this distinction is the
following subdivision of the new variables into groups. We define formally

(2.3)  $v_{kh \ldots a\,(i)_q} = v_{kh \ldots a\; i_1 \ldots i_q} = D_k D_h \cdots D_a\, \dfrac{\partial^q u}{\partial x^{i_1} \cdots \partial x^{i_q}}$,

and construct the first order system so as to satisfy these definitions
identically.

In the first group we take $q = 0$; and for higher values of $q$ up to $m - 1$
inclusive there is a group of equations corresponding to each distinct array
$i_1, \ldots, i_q$, where order is immaterial.

A fixed ordering $a, b, c, \ldots, k, \ldots$ of the operators $D_k$ is used in each of
these groups. However, as the exact selection of these indices depends on the
boundary conditions, we shall for the present reserve the choice of select
and non-select values.

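Certain “commutator” expressions appear and we now define them. The
symbol

(2.4)  $C_{a\, i_q\, kh \ldots b\; i_1 \ldots i_{q-1}}[u]$

shall denote the reduced form of the expression

(2.5)  $D_a D_k \cdots D_b\, \dfrac{\partial}{\partial x^{i_q}}\, \dfrac{\partial^{q-1} u}{\partial x^{i_1} \cdots \partial x^{i_{q-1}}} \; - \; \dfrac{\partial}{\partial x^{i_q}}\, D_k \cdots D_b D_a\, \dfrac{\partial^{q-1} u}{\partial x^{i_1} \cdots \partial x^{i_{q-1}}}$;

this reduced form contains derivatives of $u$ of lower orders in $a, b, \ldots, h, k$,
$i_1 \ldots i_{q-1}$, with coefficients which are functions of the $x^i$.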


By writing

$C_{a\, i_q\, kh \ldots b\; i_1 \ldots i_{q-1}}[v]$

we shall indicate that the derivatives of $u$ have been formally replaced by
the corresponding variables

$v_{kh \ldots a\,(i)_q}$

as defined in (2.3). This is possible, since we will show that all derivatives of
$u$ can be expressed as linear combinations of the variables $v$ in (2.3). In fact
we shall prove by induction that all $k + 1$ derivatives

$\dfrac{\partial^k u}{\partial x^{k-h}\, \partial t^{h}} \qquad (h = 0, 1, \ldots, k)$

can be expressed as linear combinations of the $k + 1$ variables

$v_{hg \ldots a},\ v_{kg \ldots a},\ \ldots,\ v_{kh \ldots b}$,

in each of which one of the $k + 1$ operations is omitted, and of variables with
a lesser number of subscripts. To show this, we note that by (1.7) and (2.3)

(2.6)  $v_{kh \ldots b} = \sum_{h=0}^{k} S_h^{a}(\gamma)\, \dfrac{\partial^k u}{\partial x^{k-h}\, \partial t^{h}} + F_{k-1}[u]$,

where $S_h^{a}(\gamma)$ is the elementary symmetric function of degree $h$ of the $k$ quantities
$\gamma_b, \ldots, \gamma_h, \gamma_k$, that is, with $\gamma_a$ omitted. Forming similar equations with
$b, c, \ldots, h, k$ omitted in turn, we see that the system can be solved for the $k$th order
derivatives of $u$ provided that the determinant $|S_h^{a}(\gamma)|$ is not zero. This is
proved separately in Lemma 1 below, and thus our assertion is verified.

In (2.6) $F_{k-1}[u]$ denotes an operator in $\partial/\partial x$, $\partial/\partial t$ of order $k - 1$, which
by the induction hypothesis may be considered to be already expressed as
a combination of the $v$'s.
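The inversion described around (2.6) can be tried out numerically in the simplest situation, where the characteristic roots are constants, so that the $D$ operators commute and the lower-order term $F_{k-1}[u]$ is absent. The sketch below is an illustration under that assumption, not the paper's construction. Row $a$ of the matrix holds the elementary symmetric functions of all the roots except $\gamma_a$, and the derivatives are recovered from the $v$'s by exact Gaussian elimination. All names and sample values are hypothetical.

```python
from fractions import Fraction


def elementary_symmetric(values):
    """Return [e_0, e_1, ..., e_n] for the given numbers."""
    coeffs = [Fraction(1)]
    for x in values:
        new = coeffs + [Fraction(0)]
        for r in range(len(coeffs), 0, -1):
            new[r] = new[r] + coeffs[r - 1] * x
        coeffs = new
    return coeffs


def omitted_root_matrix(gammas):
    """Row a lists the symmetric functions of all roots except gammas[a]."""
    gammas = list(gammas)
    return [elementary_symmetric(gammas[:a] + gammas[a + 1:])
            for a in range(len(gammas))]


def solve(matrix, rhs):
    """Exact Gauss-Jordan elimination; raises ValueError if singular."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("singular system: the roots are not distinct")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


# Three operators, hence second-order derivatives (u_xx, u_xt, u_tt).
gammas = [1, 2, 3]                                      # assumed constant roots
derivatives = [Fraction(4), Fraction(-1), Fraction(7)]  # sample values
S = omitted_root_matrix(gammas)
v = [sum(s * d for s, d in zip(row, derivatives)) for row in S]
recovered = solve(S, v)
print(recovered == derivatives)
```

If two of the roots coincide, two rows of the matrix are equal and `solve` raises an error, which is the numerical counterpart of the requirement that the determinant be different from zero.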

The groups shall be written with definitions and differential equations in
parallel columns. For $q = 0$ we have the “triangular” array of equations
shown in Table I.

In Table I $a, b, c, \ldots, k, \ldots, n$ denote the distinct numbers from 1 to
$m$ in an as yet undefined order. The operator $L_1[v]$ is defined by replacing
derivatives of $u$ in $L_1(u)$ (cf. (1.9)) with the appropriate first derivatives or values
of the $v$'s. This array or group of equations contains $m$ subgroups, the $k$th
subgroup containing $k$ equations each with a different operator $D$.

For each array $(i)_q = (i_1, \ldots, i_q)$ we have a similar group of equations,
in which appear on the right side certain first derivatives with respect to
$x^{i_q}$. We let $(i)_{q-1} = (i_1 \ldots i_{q-1})$ and construct Table II, which also contains
a triangular array, with subgroups of increasing size.










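TABLE I

DEFINITIONS                                      DIFFERENTIAL EQUATIONS

$v = u$                                          $D_a v = v_a$

$v_b = D_b u$                                    $D_a v_b = v_{ba} + C_{ab}[v]$
$v_a = D_a u$                                    $D_b v_a = v_{ba}$

$v_{cb} = D_c D_b u$                             $D_a v_{cb} = v_{cba} + C_{acb}[v]$
$v_{ca} = D_c D_a u$                             $D_b v_{ca} = v_{cba} + C_{bca}[v]$
$v_{ba} = D_b D_a u$                             $D_c v_{ba} = v_{cba}$

. . .

$v_{kh \ldots b} = D_k D_h \cdots D_b u$         $D_a v_{kh \ldots b} = v_{khg \ldots a} + C_{akh \ldots b}[v]$
. . .
$v_{kg \ldots a} = D_k D_g \cdots D_a u$         $D_h v_{kg \ldots a} = v_{khg \ldots a} + C_{hkg \ldots a}[v]$
$v_{hg \ldots a} = D_h D_g \cdots D_a u$         $D_k v_{hg \ldots a} = v_{khg \ldots a}$

. . .

$v_{nm \ldots b} = D_n D_m \cdots D_b u$         $D_a v_{nm \ldots b} = L_1[v] + C_{anm \ldots b}[v]$
. . .
$v_{nl \ldots a} = D_n D_l \cdots D_a u$         $D_m v_{nl \ldots a} = L_1[v] + C_{mnl \ldots a}[v]$
$v_{ml \ldots a} = D_m D_l \cdots D_a u$         $D_n v_{ml \ldots a} = L_1[v]$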


The last of the “subgroups” of Table II contains $m - q$ equations, and
the order of $a, b, c, \ldots, h, k$ is the same in all groups, that is, for all $(i)_q$. It is
seen that the number of new variables defined in these groups is equal to
the total number of partial derivatives of $u$ with respect to $t$, $x$ and the $x^\rho$,
up to and including order $m - 1$. We emphasize that not all groups $(i_1 \ldots i_q)$
are here represented, but only one for each set of integers $(i_1 \ldots i_q)$ without
regard to order. Thus we may for simplicity assume that $i_1 \le i_2 \le i_3 \le \ldots \le i_q$.
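The count asserted above can be checked by a short computation. The sketch below rests on one reading of the tables, stated here as an assumption: for each unordered array of $q$ indices drawn from the $N - 2$ tangential variables there is one group, and that group consists of subgroups of sizes $1, 2, \ldots, m - q$. The total is compared with the number of partial derivatives of order at most $m - 1$ in $N$ variables. The function names are hypothetical.

```python
from math import comb


def multisets(n_types, size):
    """Number of unordered selections (repetition allowed) of `size` items."""
    if size == 0:
        return 1
    return comb(n_types + size - 1, size)


def variables_in_tables(N, m):
    """Count the new variables under the reading described in the lead-in."""
    total = 0
    for q in range(m):                            # q = 0, 1, ..., m - 1
        groups = multisets(N - 2, q)              # one group per unordered array
        per_group = sum(range(1, m - q + 1))      # subgroups of size 1 .. m - q
        total += groups * per_group
    return total


def derivatives_up_to_order(N, order):
    """Number of partial derivatives of order <= `order` in N variables."""
    return comb(N + order, order)


for N in range(2, 6):
    for m in range(1, 6):
        assert variables_in_tables(N, m) == derivatives_up_to_order(N, m - 1)
print(variables_in_tables(3, 3), derivatives_up_to_order(3, 2))
```

For $N = 3$ and $m = 3$ both counts equal 10: six variables in the group $q = 0$, three in the group $q = 1$, and one in the group $q = 2$.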


3. Reduction of the boundary conditions. Let the derivatives of $u$
with respect to $x$ up to order $m - 1$ be paired in order with the numbers
$a, b, c, \ldots, n$ which label the $D$ operators:

(3.1)  $\begin{array}{ccccc} u & u_x & u_{xx} & \cdots & u_x^{(m-1)} \\ a & b & c & \cdots & n \end{array}$

We establish an ordered correspondence between the operators of the
sequence $D_a, D_b, \ldots, D_n$ in Table I and the derivatives $u, u_x, \ldots, u_x^{(m-1)}$.
The labels $a, b, \ldots, n$ shall be chosen so that to an assigned derivative $u_x^{(h)}$
there corresponds a select operator $D_k$, and vice versa. This is possible, in
general in a number of ways, since the number of select operators has been
taken as equal to the number of boundary conditions. It is now assumed
that this arrangement has been adopted in advance in Tables I and II. We
repeat that the select $v$'s are those which are operated on in (2.1) by a $D$
operator which is select according to this scheme.



















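TABLE II

DEFINITIONS   (left)   and   DIFFERENTIAL EQUATIONS   (right)

$v_{(i)_q} = \dfrac{\partial^q u}{\partial x^{i_1} \cdots \partial x^{i_q}} = u_{(i)_q}$,   $D_a v_{(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{a(i)_{q-1}} + C_{a\, i_1 \ldots i_q}[v]$

$v_{b(i)_q} = D_b u_{(i)_q}$,   $D_a v_{b(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{ba(i)_{q-1}} + C_{a\, i_q\, b\; i_1 \ldots i_{q-1}}[v]$
$v_{a(i)_q} = D_a u_{(i)_q}$,   $D_b v_{a(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{ba(i)_{q-1}} + C_{b\, i_q\, a\; i_1 \ldots i_{q-1}}[v]$

$v_{cb(i)_q} = D_c D_b u_{(i)_q}$,   $D_a v_{cb(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{cba(i)_{q-1}} + C_{a\, i_q\, cb\; i_1 \ldots i_{q-1}}[v]$
$v_{ca(i)_q} = D_c D_a u_{(i)_q}$,   $D_b v_{ca(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{cba(i)_{q-1}} + C_{b\, i_q\, ca\; i_1 \ldots i_{q-1}}[v]$
$v_{ba(i)_q} = D_b D_a u_{(i)_q}$,   $D_c v_{ba(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{cba(i)_{q-1}} + C_{c\, i_q\, ba\; i_1 \ldots i_{q-1}}[v]$

. . .

$v_{kh \ldots b(i)_q} = D_k D_h \cdots D_b u_{(i)_q}$,   $D_a v_{kh \ldots b(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{khg \ldots a(i)_{q-1}} + C_{a\, i_q\, kh \ldots b\; i_1 \ldots i_{q-1}}[v]$
. . .
$v_{kg \ldots a(i)_q} = D_k D_g \cdots D_a u_{(i)_q}$,   $D_h v_{kg \ldots a(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{khg \ldots a(i)_{q-1}} + C_{h\, i_q\, kg \ldots a\; i_1 \ldots i_{q-1}}[v]$
$v_{hg \ldots a(i)_q} = D_h D_g \cdots D_a u_{(i)_q}$,   $D_k v_{hg \ldots a(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{khg \ldots a(i)_{q-1}} + C_{k\, i_q\, hg \ldots a\; i_1 \ldots i_{q-1}}[v]$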

These boundary conditions must be reduced to the form (2.2) where the
select $v$'s appear only on the left. Let us consider first the group $q = 0$: for
the other groups the calculations are similar. If $u$ is given, $D_a$ and $v$ shall be
select; and then $v = u$ is a boundary condition of type (2.2). If $u$ is not given
then no boundary condition for $v$ will be needed, as $D_a$ is then non-select.

If $u_x$ is given then $D_b$ is select. There are two cases, according as $D_a$ is select
or not. If $D_a$ is select, then $v_a = u_x + \gamma_a u_t$ and $v_b = u_x + \gamma_b u_t$ are both
assigned and these conditions are of the form (2.2). If $D_a$ is not select, we
eliminate $u_t$ between these two relations and find

(3.2)  $v_b = \dfrac{\gamma_b}{\gamma_a}\, v_a + \left(1 - \dfrac{\gamma_b}{\gamma_a}\right) u_x$,

which again has the form (2.2). However, if $u_x$ is not given, there are two
other cases. If $D_a$ is select, and $D_b$ is not, we can solve (3.2) for $v_a$, which
is then the single necessary boundary condition of form (2.2). If neither $D_a$
nor $D_b$ is select, no boundary condition is needed.
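The elimination that produces (3.2) is ordinary linear algebra and can be confirmed on sample numbers. The sketch below assumes the two relations $v_a = u_x + \gamma_a u_t$ and $v_b = u_x + \gamma_b u_t$ as read from the passage above, and checks that removing $u_t$ gives $v_b = (\gamma_b/\gamma_a)\, v_a + (1 - \gamma_b/\gamma_a)\, u_x$. The function names and the numerical values are hypothetical.

```python
from fractions import Fraction


def characteristic_variables(u_x, u_t, gamma_a, gamma_b):
    """The two first-order variables built from u_x and u_t."""
    v_a = u_x + gamma_a * u_t
    v_b = u_x + gamma_b * u_t
    return v_a, v_b


def v_b_from_v_a(v_a, u_x, gamma_a, gamma_b):
    """Boundary relation obtained by eliminating u_t (needs gamma_a != 0)."""
    ratio = Fraction(gamma_b) / Fraction(gamma_a)
    return ratio * v_a + (1 - ratio) * u_x


u_x, u_t = Fraction(3, 2), Fraction(-5)   # sample boundary values
gamma_a, gamma_b = 2, 7                   # sample distinct nonzero roots
v_a, v_b = characteristic_variables(u_x, u_t, gamma_a, gamma_b)
print(v_b_from_v_a(v_a, u_x, gamma_a, gamma_b) == v_b)
```

The division by $\gamma_a$ is the reason the characteristic roots must be different from zero, which is the same as requiring that $T$ be non-characteristic.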













We may now proceed by induction from one subgroup to the next. If the
boundary conditions for the $k$th subgroup have been put in the form (2.2)
then all select variables of that group can be expressed in terms of non-select
variables. Thus in treating the next subgroup we can allow all variables of
the preceding groups to appear on the right side of the boundary conditions,
as the select ones can later be eliminated by means of the preceding boundary
conditions. If in (2.6) we replace the quantity $F_{k-1}[u]$ by its formal equivalent
in terms of the $v$'s, we see that only those $v$'s of the preceding groups will
appear. Consequently this term of (2.6) may be considered as non-essential
in the remainder of the calculation.

Assuming then that the result holds for the $(k - 1)$st subgroup, let us
prove it for the $k$th subgroup. We divide the $k$ equations (2.6) into select
and non-select categories according as the $v$ variable on the left is select
or not. All derivatives of $u, u_x, \ldots, u_x^{(k-1)}$ with respect to $t$ are known, or
select, on $T$, according as $u, u_x, \ldots, u_x^{(k-1)}$ is select or not. Thus if $h$ of the $k$
quantities $u, u_x, \ldots, u_x^{(k-1)}$ are select, then $h$ of the derivatives written explicitly
on the right side of (2.6) are select. Let us pick out the $k - h$ non-select
equations of (2.6) and solve them for the $k - h$ non-select derivatives of
$u$ in terms of the select derivatives of $u$ and the (non-select) variables $v$ of
these equations. The possibility of this depends on the non-vanishing of a
determinant of which the elements are symmetric functions of the $\gamma$'s, with
one of the $\gamma$'s omitted in each row. Supposing, as will be shown in Lemma 1
of § 4, that all such determinants are different from zero, we can carry out
this inversion of the non-select equations (2.6), and then replace the $k - h$
non-select derivatives of $u$ in the $h$ select equations of (2.6) by the expressions
so found for them. These $h$ select equations will then take the form (2.2)
since all earlier groups of select variables can be eliminated from the right
sides.

Now consider the groups with $q > 0$. As all differentiations with respect
to $x^\rho$ ($\rho = 1, \ldots, N - 2$) are tangential to $T$, the derivatives

$\dfrac{\partial^{q+h} u}{\partial x^{i_1} \cdots \partial x^{i_q}\, \partial x^{h}}$

are select according as $\partial^h u/\partial x^h$ is select, or not. Thus equations similar to
(2.6) can be written down for any such group, and the coefficients of the
terms shown explicitly will be exactly the same, and the structure of the
operator $F_{k-1}[u]$ will be unaltered except that instead of $u$ as argument we
will have

$\dfrac{\partial^q u}{\partial x^{i_1} \cdots \partial x^{i_q}}$.

It follows that the boundary conditions for all groups with $0 \le q \le m - 1$
can be put in the same form (2.2).


Remark. Suppose that the $k_0$ boundary conditions are linear and
independent relations among the $m$ quantities $u, u_x, u_{xx}, \ldots, u_x^{(m-1)}$ on $T$.













Such a system of boundary conditions can be reduced to a triangular standard
form

(3.3)  $u_x^{(h_i)} + \sum_{j} c_{ij}\, u_x^{(r_j)} = g_i, \qquad (i = 1, \ldots, k_0)$,

where the orders $h_i$ of the leading derivatives form an increasing sequence:

$h_1 < h_2 < \ldots < h_{k_0}$.

In addition, the indices $r_j$ are all less than $h_i$, while the coefficients $c_{ij}$ are
analytic functions on $T$.

To show that this system of boundary conditions can be expressed in the
form (2.2), we need modify the previous working only slightly. In the array
(3.1) we choose as select operators $D_a, \ldots, D_n$ those corresponding to the
$u_x^{(h_i)}$.

We then commence with the quantity $u$: if and only if it is given in the first
array $q = 0$ of Table I, a boundary condition is required. Now, considering
$u_x$, we see that if one of the $h_i = 1$, there is a condition

$u_x + c_{i1}\, u = g_i$,

and if $u$ is given it may be replaced by its given values while, if it is not,
the corresponding variable $v$ of Table I is non-select and so may appear on
the right of the boundary conditions (2.2). Thus the form (2.2) is attained
in either case, as in the preceding calculations.

Proceeding by induction on $h_i$, we see that in the typical condition (3.3)
the terms $u_x^{(r_j)}$ are either non-select, in which case they are allowed on the
right side of (2.2), or else they can be expressed, by means of the boundary
conditions already standardized, in terms of non-select variables and given
data. As remarked earlier any variable of a previous group can be allowed
on the right side of a boundary condition in the course of such a calculation.
This completes the demonstration that (3.3) can be reduced to the standard
form of the boundary conditions for the system of first order equations.


4. A lemma on symmetric functions. To justify the reduction of the
differential equation as well as the boundary conditions, we establish a lemma
which is required in its most general form in the preceding discussion of the
boundary conditions.


LEMMA 1. Let $k$ distinct numbers $\gamma_a \ne 0$ be given, and let $s_r^{a}(\gamma) = s_r^{a}$ denote
the elementary symmetric function of degree $r$ of all $k - 1$ $\gamma$'s with $\gamma_a$ omitted.
Then every subdeterminant formed by deleting an equal number $h$ ($0 \le h \le k - 1$)
of rows and columns from the $k \times k$ determinant

(4.1)  $|s_r^{a}(\gamma)|$

is different from zero.










The numbers $\gamma_a$ corresponding to the deleted rows are the select $\gamma_a$; for
convenience we shall denote them now by $c_j$, while retaining $\gamma_a$ for the $l = k - h$
non-select numbers. The deleted columns refer to the assigned derivatives
among $u, u_x, u_{xx}, \ldots, u_x^{(k-1)}$.

Let $s_r(c)$ denote the elementary symmetric function of degree $r$ of the $c_j$:
if we now delete the $h$ select rows we observe from (4.1) that all $h$ of the select
$c_j$ are present in each of the other rows. Let $\sigma_r^{a}(\gamma) = \sigma_r^{a}$ denote the elementary
symmetric function of degree $r$ of the non-select $\gamma$'s, with $\gamma_a$ omitted. The
following property of $s_r^{a}$ is evident: $s_r^{a}$ is the convolution of the $s_m(c)$ and
$\sigma_{r-m}^{a}(\gamma)$ of all lower orders:

(4.2)  $s_r^{a}(\gamma) = \sum_{m} s_m(c)\, \sigma_{r-m}^{a}(\gamma)$.

Now let $\sigma_r$ denote the column vector with $l = k - h$ components $\sigma_r^{a}$. The
$(k - h) \times k$ matrix (the select rows have been deleted) may now be written

$\left[\, \sum_{i=0}^{r} s_i\, \sigma_{r-i} \,\right], \qquad (r = 0, 1, \ldots, k - 1)$.

We note that $\sigma_r = 0$ for $r \ge l$, so that we may write this array in the form

(4.3)  $(s_0\sigma_0,\ s_0\sigma_1 + s_1\sigma_0,\ s_0\sigma_2 + s_1\sigma_1 + s_2\sigma_0,\ \ldots,\ s_{l-1}\sigma_0 + \ldots + s_0\sigma_{l-1},$
       $\ s_l\sigma_0 + \ldots + s_1\sigma_{l-1},\ \ldots,\ s_h\sigma_{l-2} + s_{h-1}\sigma_{l-1},\ s_h\sigma_{l-1})$.

If the select columns are deleted, and the resulting square determinant $\Delta$
expanded, we see that it takes the form of a sum of $l \times l$ determinants with
columns the $\sigma_r$ ($r = 0, 1, \ldots, l - 1$). Since any one of these with two equal
columns is zero, it follows that the only non-vanishing determinant among
them is $|\sigma_0 \sigma_1 \ldots \sigma_{l-1}|$. Therefore every surviving term has this determinant
as a factor. However elementary reasoning shows that

(4.4)  $|\sigma_0 \sigma_1 \ldots \sigma_{l-1}| = \pm \prod_{a < b} (\gamma_a - \gamma_b) \ne 0$,

the product being taken over the pairs of non-select $\gamma$'s.

The array (4.3) is the symbolic product of (4.4) with the array

(4.5)  $\begin{pmatrix} s_0 & s_1 & s_2 & \cdots & s_h & 0 & \cdots & 0 \\ 0 & s_0 & s_1 & \cdots & s_{h-1} & s_h & \cdots & 0 \\ \vdots & & \ddots & & & & \ddots & \vdots \\ 0 & \cdots & 0 & s_0 & s_1 & \cdots & & s_h \end{pmatrix}$

of $l$ rows and $h + l$ columns, according to the formal rules of determinant
multiplication. We have therefore to show that the $l \times l$ determinant which
remains when any $h$ columns have been deleted from (4.5) is not zero.
This $l \times l$ determinant has the form of the representation of a Schur
function $\{\lambda\}$ corresponding to a certain partition $(\lambda)$ consisting of $h$ numbers
$\lambda_1, \ldots, \lambda_h$ arranged in decreasing order. For the theory of partitions and
S-functions we refer to (11, chapters 5, 6). The partition $(\lambda)$ is best defined










in this case by means of its conjugate partition $(\mu)$, which consists of $l$ positive
integers $\mu_i$, not necessarily distinct, arranged in decreasing order. We set

$\mu_i = h - d(i)$,

where $d(i)$ is the number of select, or assigned, derivatives of the sequence

$u, u_x, u_{xx}, \ldots, u_x^{(k-1)}$

which are encountered before the $i$th non-select derivative. From (10, chapter
6, 3.3) we see that the Schur function

(4.6)  $\{\lambda\} = |s_{\mu_i - i + j}|$

has the form of (4.5) after deletion of the select columns and transposition
about the secondary diagonal.
On the other hand, from (10, chapter 6, 3.1) we have

(4.7)  $|s_{\mu_i - i + j}| = \{\lambda\} = \dfrac{|c_j^{\lambda_i + h - i}|}{|c_j^{h - i}|}, \qquad (i, j = 1, \ldots, h)$,

where $h$ is the number of select $c_j$'s and $i$ and $j$ are indices of position in the
determinants. We recall that the $c_j$ are all distinct; the denominator in (4.7)
is the Vandermonde determinant, which is equal to

$\pm \prod_{s < t} (c_s - c_t)$,

and so is not zero. The numerator is a slightly more general type of determinant,
which has been studied in (1) and shown to be different from zero.
For this it is necessary that the $c_j$ should be distinct and positive, and the
powers $\lambda_j - j$ distinct, and these requirements are satisfied since the $\lambda_j$ are
non-increasing with $j$, while the $c_j$, being the select $\gamma_a$, are positive and distinct.
A direct proof that the Schur function $\{\lambda\}$ is a symmetric polynomial of
the $c_j$ with non-negative coefficients has been given recently in (8, Theorem 1).
Thus (4.7) is different from zero in our case since the $c_j$ are all positive.
Combining this with (4.4) we see that the original subdeterminant of (4.1)
is not zero, and this concludes the proof of the lemma.

The special case h = 0 is needed in connection with (2.6), and a sequence 
of applications with various values of 4 and /, one for each subgroup of 
Tables I and II, is needed in §3 as stated there. 
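The lemma lends itself to a brute-force check on small examples. The sketch below is an illustration, not a proof. It builds the $k \times k$ array whose row $a$ holds the elementary symmetric functions of all the numbers except $\gamma_a$, and evaluates every square minor exactly. In line with the proof, which uses the positivity of the numbers belonging to the deleted rows, the first sample takes all the numbers positive. The second sample is an assumed counter-illustration with roots of mixed sign, where an entry $\gamma_a + \gamma_b$ vanishes. Function names and sample values are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def elementary_symmetric(values):
    """Return [e_0, e_1, ..., e_n] for the given numbers."""
    coeffs = [Fraction(1)]
    for x in values:
        new = coeffs + [Fraction(0)]
        for r in range(len(coeffs), 0, -1):
            new[r] = new[r] + coeffs[r - 1] * x
        coeffs = new
    return coeffs


def omitted_root_matrix(gammas):
    """Row a lists the symmetric functions of all numbers except gammas[a]."""
    gammas = list(gammas)
    return [elementary_symmetric(gammas[:a] + gammas[a + 1:])
            for a in range(len(gammas))]


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return result


def vanishing_minors(gammas):
    """List (rows, columns) of every square minor that equals zero."""
    m = omitted_root_matrix(gammas)
    k = len(m)
    zero = []
    for size in range(1, k + 1):
        for rows in combinations(range(k), size):
            for cols in combinations(range(k), size):
                sub = [[m[r][c] for c in cols] for r in rows]
                if det(sub) == 0:
                    zero.append((rows, cols))
    return zero


print(vanishing_minors([1, 2, 3, 5]))    # distinct positive numbers
print(vanishing_minors([1, -1, 2]))      # mixed signs: some minors vanish
```

In the second sample the row belonging to the number 2 is $(1,\ 0,\ -1)$, so the $1 \times 1$ minor in its middle column is zero. This corresponds to deleting the two rows of $1$ and $-1$, which are then not both positive.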


5. Verification of the solution. By (3, Theorem 3), a piecewise analytic
solution of (2.1), satisfying (2.2) and appropriate initial conditions, exists.
Let the solution of (2.1) with given Cauchy data and boundary conditions
(2.2), which is defined by the piecewise analytic expansions of (3, Theorem 3),
be constructed, and let us show that the solution $u$ of (1.1) which we seek
is actually the component $v$ of the first equation of the first group of Table I.
To show that $v$ satisfies (1.1) we shall verify that the defining equations in










the left columns of Tables I and II hold, in succession, and use the last
subgroup of equations of the first table. When the “definitions” are re-established
the various boundary conditions will be automatically satisfied, in view of
the equivalence (2.6) between the derivatives of $u$ of a given order, and the
variables $v_{\ldots}$ of the same order. Thus it will be established that

(5.1)  $u = v$

is a solution of the original problem, since the algebraic verification of the
initial conditions will be trivial. We shall use the uniqueness property of
solutions of the first order system which are analytic on the closure of the
sector domains.

Consider first Table I, and let us verify the relations in the left-hand
column subgroup by subgroup. The first such relation, namely $v = u$, is
taken as a hypothesis, or rather a definition of $u$. The second definition of
the second subgroup is precisely the first differential equation and so is valid.
To show that the first definition of the second subgroup holds, let us define

(5.2)  $\xi_b = D_b u - v_b$.

Then $\xi_b$ is piecewise analytic on the closure of the sector domains $R_s$. Also

(5.3)  $D_a \xi_b = D_a D_b u - D_a v_b$
       $= D_b D_a u + C_{ab}[u] - D_a v_b$
       $= v_{ba} - v_{ba} + C_{ab}[u] - C_{ab}[v]$
       $= C_{ab}[\xi]$,

where we have used the first three differential equations of Table I. With

$C_{ab}[u] = D_a D_b u - D_b D_a u = \alpha D_a u + \beta D_b u$,

where $\alpha$ and $\beta$ are certain coefficients which we need not calculate explicitly,
we have

$C_{ab}[v] = \alpha v_a + \beta v_b$,

and therefore

$C_{ab}[\xi] = \alpha \xi_a + \beta \xi_b$.

Now in this case, $\xi_a = D_a u - v_a = 0$. Thus $\xi_b$ satisfies the homogeneous
linear equation

(5.4)  $D_a \xi_b = \beta \xi_b$.

The initial conditions for $\xi_b$ are also homogeneous, as follows from the
definition of initial conditions for the $v$ variables. If $D_a$ is select, there is a
homogeneous boundary condition for $\xi_b$ on $T$. In this case only, discontinuities
of the derivatives of $\xi_b$ across $G_a$ are in principle permitted, but the expansions
of (3) applied to this equation show that all such jumps are here zero. Since
$G_a$ is the only characteristic surface of (5.4), it follows that $\xi_b$ is analytic
everywhere and so identically zero.










To verify the third subgroup of definitions, note that the last of these
relations is now equivalent to the last differential equation of the preceding
subgroup. Define

(5.5)  $\xi_{cb} = D_c D_b u - v_{cb} = D_c v_b - v_{cb}$,
       $\xi_{ca} = D_c D_a u - v_{ca} = D_c v_a - v_{ca}$;

then

(5.6)  $D_a \xi_{cb} = D_a D_c D_b u - D_a v_{cb}$
       $= D_c D_b D_a u + C_{acb}[u] - D_a v_{cb}$
       $= D_c v_{ba} - v_{cba} + C_{acb}[u] - C_{acb}[v]$
       $= C_{acb}[\xi]$,

and likewise

(5.7)  $D_b \xi_{ca} = C_{bca}[\xi]$,

using the differential equations and previously established definitions. Here
the $C_{acb}[\xi]$ expressions are linear homogeneous in the $\xi$ variables with fewer
than three indices: since all one-index $\xi$'s are zero, (5.6) and (5.7) form a
linear homogeneous system for $\xi_{cb}$, $\xi_{ca}$. Again, these functions satisfy
homogeneous initial conditions. As above it follows in either case that $\xi_{cb}$ and $\xi_{ca}$
vanish identically.

The inductive procedure for the $k$th subgroup is similar: the last definition
of the subgroup is true in view of the previously established definitions and
the last differential equation of the preceding subgroup. We define $k - 1$
variables $\xi_{kh \ldots b}, \ldots, \xi_{kg \ldots a}$ as follows:

(5.7)  $\xi_{kh \ldots b} = D_k D_h \cdots D_b u - v_{kh \ldots b}$,
       . . .
       $\xi_{kg \ldots a} = D_k D_g \cdots D_a u - v_{kg \ldots a}$,

there being a different $D$ operator missing in each of these sequences of
differentiations. Then

(5.8)  $D_a \xi_{kh \ldots b} = D_a D_k D_h \cdots D_b u - D_a v_{kh \ldots b}$
       $= D_k D_h \cdots D_b D_a u + C_{akh \ldots b}[u] - D_a v_{kh \ldots b}$
       $= D_k v_{hg \ldots a} - v_{khg \ldots a} + C_{akh \ldots b}[u] - C_{akh \ldots b}[v]$
       $= C_{akh \ldots b}[\xi]$,

and there are $k - 2$ similar equations of which the last is

(5.9)  $D_h \xi_{kg \ldots a} = C_{hkg \ldots a}[\xi]$.

Here the $C_{akh \ldots b}[\xi]$ are linear expressions containing the $\xi_{kh \ldots b}, \ldots, \xi_{kg \ldots a}$, as
well as those of lower order (which are now known to be zero). Thus (5.8), ...,
(5.9) form a self-contained linear homogeneous system, with homogeneous
initial and boundary conditions. Since the $\xi_{kh \ldots b}, \ldots, \xi_{kg \ldots a}$ are analytic on
the closure of each sector $R_s$ they are identically zero, in view of the uniqueness
theorem in (3).













The proof that the last subgroup of defining relations holds for the solutions
of the first order system is similar to the earlier steps of the induction. The
only difference is that the operator $L_1[v]$ replaces the variable $v_{khg \ldots a}$ in the
general step. Since this quantity does not appear in the final form (5.8) and
(5.9) of the equations for the $\xi$'s, this change has no effect on the result. This
shows, then, that all defining relations of Table I are valid.

Let us show that the defining equations of Table II hold for each index
group $(i)_q = (i_1 \ldots i_q)$ by induction on $q$ for each of these groups. Let
$(i)_{q-1} = (i_1 \ldots i_{q-1})$ and let us assume that the result has been proved for
the $(i)_{q-1}$ group. First define

(5.10)  $\xi_{(i)_q} = \dfrac{\partial^q u}{\partial x^{i_1} \cdots \partial x^{i_q}} - v_{(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{(i)_{q-1}} - v_{(i)_q}$.

We see that

(5.11)  $D_a \xi_{(i)_q} = D_a \dfrac{\partial}{\partial x^{i_q}} v_{(i)_{q-1}} - D_a v_{(i)_q}$
        $= D_a \dfrac{\partial}{\partial x^{i_q}} v_{(i)_{q-1}} - \dfrac{\partial}{\partial x^{i_q}} v_{a(i)_{q-1}} - C_{a\, i_1 \ldots i_q}[v]$,

by the first differential equation of the group $(i)_q$. However the right side
of (5.11) contains only $v$ variables of $q$-order $(i)_{q-1}$ or less, and in view of
the definition (2.5) of the commutator operator, this side of (5.11) will be zero,
since all variables

$v_{(i)_{q-1}}$

have been proved equal to the corresponding partial derivatives of $u$. It
follows by differentiation of the initial and boundary conditions that

$\xi_{(i)_q}$

vanishes identically.
For the typical $k$th subgroup of Table II, we have a system of $k$ equations
involving the variables

(5.12)  $\xi_{kh \ldots b(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{kh \ldots b(i)_{q-1}} - v_{kh \ldots b(i)_q}$,
        . . .
        $\xi_{kg \ldots a(i)_q} = \dfrac{\partial}{\partial x^{i_q}} v_{kg \ldots a(i)_{q-1}} - v_{kg \ldots a(i)_q}$.

Then, for instance, from the first differential equation of the subgroup, we
have

(5.13)  $D_a \xi_{kh \ldots b(i)_q} = D_a \dfrac{\partial}{\partial x^{i_q}} v_{kh \ldots b(i)_{q-1}} - D_a v_{kh \ldots b(i)_q}$
        $= D_a \dfrac{\partial}{\partial x^{i_q}} v_{kh \ldots b(i)_{q-1}} - \dfrac{\partial}{\partial x^{i_q}} v_{khg \ldots a(i)_{q-1}} - C_{a\, i_q\, kh \ldots b\,(i)_{q-1}}[v]$.







Now the right side contains a commutator which includes only $v$ variables of
order $q - 1$ or less with respect to the $x^\rho$, or else of order $q$ but from a preceding
subgroup of the $q$th group. As the identification of all these with the
corresponding derivatives of $u$ has already been made, we see by the definition
(2.5) that the right side of (5.13) reduces to zero. Similarly, all other right
sides, obtained by differentiation of the quantities in (5.12) by appropriate
$D_a \ldots D_k$ operators, are seen to vanish. Homogeneous auxiliary conditions
are applicable as before, and it follows that the variables $\xi$ in (5.12) vanish
identically.

Proceeding thus by induction we make all the identifications of the various
$q$ groups, and so identify all derivatives of $u = v$ with the appropriate $v$
variables as foreshadowed in (2.3). This proves that $u$ satisfies all the initial
and boundary conditions. It remains now to show that $u$ satisfies the original
linear partial differential equation of order $m$. However this follows at once
from the very last differential equation of Table I and the definition (1.9)
of the operators $L_1(u)$ and $L_1[v]$. The existence proof is therefore complete.


THEOREM 1. Let $L(u) = 0$ be an analytic linear differential equation of order
$m$ which is regularly hyperbolic with respect to analytic initial and boundary
surfaces $S : t = 0$ and $T : x = 0$. Let $k_0$ characteristic surfaces $G_i$ issuing from
$C = S \cap T$ into the region $R$ be selected, and let $k_0$ of the quantities

$u,\ \dfrac{\partial u}{\partial x},\ \ldots,\ \dfrac{\partial^{m-1} u}{\partial x^{m-1}}$

be assigned on $T$ in addition to Cauchy data on $S$. Then there exists a piecewise
analytic solution $u$ assuming the given initial and boundary values, and analytic
except across the $G_i$, where it is of class $C^{m-1}$.


This analytic solution is piecewise analytic in the strong sense described
in (2, § 10); that is, it is analytic on the closures of the distinct sector domains
$D_s$ which separate the select characteristic surfaces. The solution must still
be proved unique even within this class of functions, since there are many
ways of setting up the corresponding first order system, and it is necessary
to show that these distinct ways all lead to the same discontinuities of
derivatives across the select characteristic surfaces.

To prove this, let us suppose that $u$ is piecewise analytic in the above
strong sense, that $u$ is $C^{m-1}$, and analytic except across the select $G_i$; that
$L(u) = 0$ and that homogeneous Cauchy data and boundary conditions have
been assigned. Then $u$ and its derivatives satisfy the system of Tables I
and II, which is arranged according to any one of these alternative ways.
However the data entering this system are all zero. Since a solution of the
system is unique in the strongly piecewise analytic function class (3), the
solution of the system is identically zero. Hence $u = 0$ as was to be proved.

As a further corollary we add that if every characteristic surface $G_i$ issuing
into the domain is select, then our solution is unique in the class of $C^{m}$
functions. This now follows at once from the uniqueness in the $C^{1}$ class of solutions
of the first order system, when every characteristic root is real and not zero,
and when all positive roots are select (3, § 10).

We have seen in §3 that lower order derivatives with respect to $x$ can be
permitted to appear in the boundary conditions. It is also true that lower
order derivatives with respect to the other $N - 1$ variables can be accommodated
in the same way. This is possible since we could establish the definitions
required in working back to the $m$th order equation in a lexicographic sequence
which takes account of first, order of the derivative, second, index $(i)$ of the
group, and third, ordering of the $D$ operators within each subgroup. However
it is not possible to permit oblique derivatives of an order equal to the highest
order which occurs in the boundary condition, as can be seen even for hyperbolic
equations of first or second order (2). Such conditions will lead to
inconsistencies if the directional derivatives involved have characteristic directions.

We comment on the fact that non-analytic ‘‘kinks’’ can be chosen to occur 
on some, but perhaps not all, characteristic surfaces issuing into the region. 
Each such characteristic surface may be thought of as corresponding to a 
particular kind of wave, generated at the boundary. For a vibrating beam 
there will be flexural and shear waves, travelling at different speeds, for 
example. Our theorem shows that there is a solution, satisfying one boundary 
condition, in which only one type of wave is generated. With a suitable bound- 
ary condition, it is quite possible to have a solution in which only some other 
type of wave arises. In a physical problem, the appropriate linear combination 
of these solutions would have to be selected by some further conditions; usually 
these would be additional boundary conditions. 


6. The non-analytic case. An existence theorem for hyperbolic equa- 
tions of order m has been given by Leray (9) under an assumption of finite 
differentiability, and by means of analytic approximations for which uniform 
estimates are obtained through the use of energy integrals. More recently 
Gårding (5) has given a direct existence proof using only the energy integrals.
These calculations refer to the Cauchy problem, and we shall here investigate
their application to mixed problems as in the preceding sections. For second
order equations, this aspect of mixed problems has been treated, for example,
in (2, 7, 10).

The results which we obtain do generalize the known theorems for second 
order equations in two different ways, which will constitute Cases I and II 
below. However it has not been possible to attain the generality of the theorem 
for analytic equations, and a considerable gap remains to be filled. It should 
be remarked that for the case of two variables a thorough treatment by 
Picard’s method has been given by Campbell and Robinson (1), covering 
semilinear equations as well. The energy integral method has been applied 
to the linear problem in two variables by Thomée (12). 

In contrast to the analytic case, we must now assume that the differential 
equation (1.1) is regularly hyperbolic in the sense of Leray: that is, in effect,
that there exist timelike directions and that the normal cone is real and
has no multiple points except the origin. This criterion will be fulfilled if the 
initial surface S is so situated that the equation is regularly hyperbolic (in 
the sense of § 1) with respect to S and to every surface T meeting S in a 
smooth hypercurve C. 

When applied to a mixed problem, the energy integral formulae are modified 
by the presence of a boundary integral taken over the surface T. To complete 
the estimates we must show that this boundary integral form is semi-bounded. 
We therefore begin with an algebraic study of this boundary term, and will 
use the elegant notation of Hörmander (6) for the algebra of energy integrals.

Let the terms of highest order m in (1.1) be written

(6.1)    $P(D)u, \qquad D_j = \dfrac{1}{i}\,\dfrac{\partial}{\partial x^j},$

where $P(D)$ is a polynomial of order $m$; and let $Q(D)$ be a real polynomial
operator of order $m - 1$ in $D$. Then the quadratic form

(6.2)    $F(D, \bar D)\,u\bar u = P(D)u\,Q(\bar D)\bar u - P(\bar D)\bar u\,Q(D)u$

is a divergence expression

(6.3)    $-i \sum_j \dfrac{\partial}{\partial x^j}\bigl(G_j(D, \bar D)\,u\bar u\bigr),$

where the operators $G_j(D, \bar D)$ are related to $F(D, \bar D)$ by the equation

(6.4)    $F(\zeta, \bar\zeta) = \sum_j (\zeta_j - \bar\zeta_j)\,G_j(\zeta, \bar\zeta).$

Here $\zeta_j = \xi_j + i\eta_j$ and $\bar\zeta_j = \xi_j - i\eta_j$ are complex variables dual to $D_j$. In
forming (6.3) we shall assume that the coefficients of $P(D)$ and $Q(D)$ are
constants; this assumption can later be relaxed.

Writing the differential equation (1.1) in the form

(6.5)    $Lu = P(D)u + B(D)u = f(x),$

where $B(D)$ is an operator of order less than $m$, we integrate the expression
$2\,\mathrm{Re}\,Lu\,Q(\bar D)\bar u$ over a lens-shaped region $R_t$ such as is described in (3, Figure
2). This region is bounded by initial and final surfaces $S_0$ and $S_t$ ($t$ = const),
and by a portion $T_t$ ($x = 0$) of the boundary surface $T$. We find, on the
one hand

(6.6)    $i\int_{R_t} F(D, \bar D)\,u\bar u\,dV = \int_{R_t} Q(D, \bar D, u, \bar u, f)\,dV,$

where the quadratic form $Q(D, \bar D, u, \bar u, f)$ is of order $m - 1$ or less in the
$D_i$, and contains factors linear in $f$. On the other hand, by (6.3) we have

(6.7)    $i\int_{R_t} F(D, \bar D)\,u\bar u\,dV = \int_{S_t} G_t(D, \bar D)\,u\bar u\,dS_t - \int_{S_0} G_t(D, \bar D)\,u\bar u\,dS_0 - \int_{T_t} G_x(D, \bar D)\,u\bar u\,dS_T,$

the minus sign in the last integral being due to the convention that $x$ shall
be measured as increasing along the interior normal to $T_t$. Comparing (6.6)
and (6.7), we find

(6.8)    $\int_{S_t} G_t(D, \bar D)\,u\bar u\,dS_t - \int_{S_0} G_t(D, \bar D)\,u\bar u\,dS_0 = \int_{T_t} G_x(D, \bar D)\,u\bar u\,dS_T + \int_{R_t} Q(D, \bar D, u, \bar u, f)\,dV.$


The method of Leray and Gårding is based on the fact that if the auxiliary
operator $Q(D)$ is so chosen that the sheets of its normal cone separate the
sheets of the normal cone of $P(D)$, then the integral over $S_t$ is positive definite
(4, Lemma 3.1). For this purpose we may assume that the coefficients of
$i^m D_t^m$ in $P(D)$, and of $i^{m-1} D_t^{m-1}$ in $Q(D)$, are both $+1$.

Now let us suppose that $P(D)$ and $Q(D)$ have variable but sufficiently
often differentiable coefficients. Then (6.3) is modified by the addition of
derivatives of order lower than $m - 1$, and a quadratic form in such derivatives
of $u$, $\bar u$ will appear in (6.7). These terms may be absorbed in the integral
over $R_t$ in (6.8), which is thus not changed in form. Also, in this case, the
integral over $S_t$ can be made positive definite in all derivatives of $u$ of all
orders $\le m - 1$, by the addition of derivatives to the integrand $G_t(D, \bar D)\,u\bar u$
of orders not greater than $m - 2$. Again, such terms in the new integrand
$G_t(D, \bar D)\,u\bar u$ over $S_t$ can be counterbalanced by terms in the volume integral
containing derivatives of order no higher than $m - 1$. Consequently (6.8)
holds unchanged in form for the case of variable coefficients as well, with
a somewhat different quadratic form $Q(D, \bar D, u, \bar u, f)$ in the volume integral.
Set

(6.9)    $E_k(t) = \int_{S_t} \sum_{|\alpha| \le k} |D^\alpha u|^2\,dS_t,$

the summation being taken over all essentially distinct partial derivatives
of $u$ of order less than $k + 1$. By the positive definiteness of the integrand in
the integral over $S_t$, we find (4, Theorem 2.1; 5, Theorem 3.1) that there
exists a constant $c > 0$, depending only on the differential equation and the
domain, such that

(6.10)    $\int_{S_t} G_t(D, \bar D)\,u\bar u\,dS_t \ge E_{m-1}(t) - c\,E_{m-2}(t),$

for every $t$ and all $u \in C^{m-1}$.


We may express $E_{m-2}(t)$ as the integral of the time-derivative of its integrand:
this leads to an estimate

(6.11)    $E_{m-2}(t) \le E_{m-2}(0) + K \int_0^t E_{m-1}(\tau)\,d\tau.$

It now follows from (6.8) and (6.10) that

(6.12)    $E_{m-1}(t) \le c\,E_{m-2}(t) + \int_{S_t} G_t(D, \bar D)\,u\bar u\,dS_t$
          $\le c\,E_{m-2}(0) + cK \int_0^t E_{m-1}(\tau)\,d\tau + \int_{S_0} G_t(D, \bar D)\,u\bar u\,dS_0 + \int_{R_t} Q(D, \bar D, u, \bar u, f)\,dV + \int_{T_t} G_x(D, \bar D)\,u\bar u\,dS_T.$

By Schwarzian estimations of the third and fourth terms on the right-hand
side of this last inequality, we find

(6.13)    $E_{m-1}(t) \le c\,E_{m-2}(0) + K_1 E_{m-1}(0) + K_2 \int_0^t E_{m-1}(\tau)\,d\tau + I_T(t)$
          $\le K_0 + K_2 \int_0^t E_{m-1}(\tau)\,d\tau + I_T(t),$

where we have written

(6.14)    $I_T(t) = \int_{T_t} G_x(D, \bar D)\,u\bar u\,dS_T.$

Here $K_0$ and $K_2$ are constants depending on all the data of the problem, but
not on $u$.


Now let us suppose that we can prescribe a similar estimate

(6.15)    $I_T(t) \le K_4 + K_5 \int_0^t E_{m-1}(\tau)\,d\tau$

for the boundary integral. By standard methods (9; 4, Lemma 1.2) we can
now establish a conventional $L^2$ estimate

(6.16)    $E_{m-1}(t) \le (K_0 + K_4) \exp[(K_2 + K_5)t].$

Integration with respect to $t$ leads to $L^2$ estimates over the entire domain
$R_t$, and the process of solution, using analytic approximation together with
Sobolev’s lemma and Ascoli’s selection theorem, then proceeds as in (9,
p. 162). Further details will not be presented here.
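
The passage from (6.13) and (6.15) to (6.16) is an instance of Gronwall's inequality: if $E(t) \le A + B\int_0^t E(\tau)\,d\tau$, then $E(t) \le A \exp(Bt)$. The following Python sketch checks both statements numerically on a grid. The constants `A`, `B` and the two sample functions are illustrative assumptions and are not quantities taken from the paper.

```python
import math

def integral_bound_holds(E, A, B, t_max=2.0, n=2000):
    """Check the hypothesis E(t) <= A + B * int_0^t E on a uniform grid.

    The integral is accumulated with the trapezoidal rule.
    """
    h = t_max / n
    running = 0.0
    for i in range(n + 1):
        t = i * h
        if i > 0:
            running += 0.5 * h * (E(t - h) + E(t))
        if E(t) > A + B * running + 1e-9:
            return False
    return True

def gronwall_bound_holds(E, A, B, t_max=2.0, n=2000):
    """Check the conclusion E(t) <= A * exp(B t) on the same grid."""
    h = t_max / n
    return all(E(i * h) <= A * math.exp(B * i * h) + 1e-9 for i in range(n + 1))

A, B = 1.0, 3.0                         # hypothetical constants
E_good = lambda t: math.exp(1.5 * t)    # satisfies the integral inequality
E_bad = lambda t: math.exp(4.0 * t)     # grows too fast to satisfy it

print(integral_bound_holds(E_good, A, B), gronwall_bound_holds(E_good, A, B))
print(integral_bound_holds(E_bad, A, B), gronwall_bound_holds(E_bad, A, B))
```

In the setting of the text, `A` plays the role of $K_0 + K_4$ and `B` that of $K_2 + K_5$.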

To summarize this discussion, we state 


LEMMA 2. Let initial and boundary surfaces S and T subtend exactly k characteristic
surfaces of the regularly hyperbolic equation

(6.17)    $Lu = f,$

the surfaces and functions present being of class $[\tfrac12 N] + h + m$ in the closure
of the region R. Let zero Cauchy data be assigned on S, and let k of the derivatives
$u, u_x, \dots, u_x^{(m-1)}$ be assigned the value zero on T. Then if the boundary integral
$I_T(t)$ satisfies an estimate (6.15), there exists a solution $u \in C^{h+m}$ of (6.17)
which satisfies these conditions.

We note in passing that the problem with non-homogeneous boundary 
conditions can be reduced to the form above by subtraction of a function 
which satisfies the initial and boundary conditions. 

In Cases I and II below we shall need quite different methods to establish 
the inequality (6.15). The work falls into two parts, namely, an analysis of 
the case of constant coefficients, and an adaptation of this case to the more 
general situation with variable coefficients. 


7. The boundary form. We consider first the particular case of constant
coefficients in the differential operator, and use Fourier transforms to estimate
the integral over $T_t$. Let us denote by $\hat u = \hat u(\xi_i)$ the Fourier transform

    $\hat u(\xi_i) = \int e^{-i(\xi, x)}\,u(x^\rho, t)\,dx^\rho\,dt$

of a function of $x^\rho$ and $t$, defined as equal to $u(x^\rho, t)$ on $T_t$ and zero elsewhere.
Here also

    $(\xi, x) = \xi_t t + \sum_\rho \xi_\rho x^\rho.$

The inverse transform is easily written down, and we note that $\xi_i$ is now
dual to $D_i = -i\,\partial/\partial x^i$. Thus, we shall need to distinguish differentiation
with respect to the transverse variable $x$ (across $T$), by writing

    $G_x(D, \bar D) = G_x(D_i, D_x, \bar D_i, \bar D_x).$

Now Parseval’s theorem (6) shows that

(7.1)    $I_T(t) = \int_{T_t} G_x(D_i, D_x, \bar D_i, \bar D_x)\,u\bar u\,dS_T = \int G_x(\xi_i, D_x, \xi_i, \bar D_x)\,\hat u\,\bar{\hat u}\,dS_\xi.$

This last integrand is a quadratic form in the variables $D_x^j \hat u$ ($j = 0, 1, \dots,
m - 1$), which are independently defined when regarded as functions of
the $\xi_i$. Thus in the case of constant coefficients we are led to study the algebraic
properties of this quadratic form. We single out the variable $\zeta_x = \xi_x + i\eta_x$
and set all other variables $\zeta_i$ in $G_x$ equal to their real parts $\xi_i$. From (6.4) it
is found that

(7.1)    $G_x(\xi_i, \zeta_x, \xi_i, \bar\zeta_x) = \dfrac{F(\xi_i, \zeta_x, \xi_i, \bar\zeta_x)}{\zeta_x - \bar\zeta_x}.$

Let us now write

(7.2)    $P(\zeta) = P(\zeta_x) = P(\xi_i, \zeta_x) = \sum_{r=0}^{m} a_r(\xi_i)\,\zeta_x^r,$
         $Q(\zeta) = Q(\zeta_x) = Q(\xi_i, \zeta_x) = \sum_{s=0}^{m-1} b_s(\xi_i)\,\zeta_x^s.$

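Then, dropping mention of the $\xi_i$ variables ($i \ne x$), we find

    $G_x(\zeta_x, \bar\zeta_x) = \dfrac{P(\zeta_x)\,Q(\bar\zeta_x) - P(\bar\zeta_x)\,Q(\zeta_x)}{\zeta_x - \bar\zeta_x}$

    $= \dfrac{\sum_{r=0}^{m} a_r \zeta_x^r \sum_{s=0}^{m-1} b_s \bar\zeta_x^s - \sum_{r=0}^{m} a_r \bar\zeta_x^r \sum_{s=0}^{m-1} b_s \zeta_x^s}{\zeta_x - \bar\zeta_x}$

    $= \sum_{r=0}^{m} \sum_{s=0}^{m-1} a_r b_s \left( \dfrac{\zeta_x^r \bar\zeta_x^s - \bar\zeta_x^r \zeta_x^s}{\zeta_x - \bar\zeta_x} \right).$

Since the expression in parentheses is the sum of the geometric series

    $\pm \sum_{j=0}^{|r-s|-1} \zeta_x^{\max(r,s)-1-j}\,\bar\zeta_x^{\min(r,s)+j}$

(the + sign being taken if $r > s$, the minus if $r < s$), we find, after some
rearrangement,

(7.3)    $G_x(\zeta_x, \bar\zeta_x) = \sum_{p,q=0}^{m-1} c_{pq}\,\zeta_x^p\,\bar\zeta_x^q,$

         $c_{pq} = \sum_{s=0}^{\min(p,q)} (b_s\,a_{p+q+1-s} - a_s\,b_{p+q+1-s}).$

Here $a_s$ is taken as zero for $s > m$, while $b_s = 0$ for $s > m - 1$. It may be
noted that $c_{pq} = c_{pq}(\xi_i)$ is homogeneous of degree $2m - 2 - p - q$ in the
variables $\xi_i$.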
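
The rearrangement leading to (7.3) can be checked numerically. The sketch below builds the matrix $c_{pq} = \sum_{s=0}^{\min(p,q)} (b_s a_{p+q+1-s} - a_s b_{p+q+1-s})$ from coefficient lists and compares the resulting quadratic form with the quotient $[P(\zeta)Q(\bar\zeta) - P(\bar\zeta)Q(\zeta)]/(\zeta - \bar\zeta)$ at a sample point. The order `m`, the random coefficients and the test point `z` are assumptions chosen only for illustration; the function names are hypothetical.

```python
import random

def boundary_matrix(a, b):
    """Matrix c[p][q] of (7.3); a = [a_0..a_m], b = [b_0..b_{m-1}]."""
    m = len(a) - 1
    A = lambda s: a[s] if s <= m else 0.0        # a_s = 0 for s > m
    Bc = lambda s: b[s] if s <= m - 1 else 0.0   # b_s = 0 for s > m - 1
    return [[sum(Bc(s) * A(p + q + 1 - s) - A(s) * Bc(p + q + 1 - s)
                 for s in range(min(p, q) + 1))
             for q in range(m)] for p in range(m)]

def poly(coeffs, z):
    """Evaluate sum_k coeffs[k] * z**k."""
    return sum(c * z**k for k, c in enumerate(coeffs))

def quotient(a, b, z):
    """[P(z) Q(zbar) - P(zbar) Q(z)] / (z - zbar)."""
    zb = z.conjugate()
    return (poly(a, z) * poly(b, zb) - poly(a, zb) * poly(b, z)) / (z - zb)

def quadratic_form(c, z):
    """sum_{p,q} c[p][q] z**p zbar**q."""
    zb = z.conjugate()
    n = len(c)
    return sum(c[p][q] * z**p * zb**q for p in range(n) for q in range(n))

random.seed(0)
m = 4                                               # sample order
a = [random.uniform(-1, 1) for _ in range(m + 1)]   # sample a_0..a_m
b = [random.uniform(-1, 1) for _ in range(m)]       # sample b_0..b_{m-1}
c = boundary_matrix(a, b)
z = 0.7 + 0.4j                                      # sample point, Im z != 0

print(abs(quotient(a, b, z) - quadratic_form(c, z)))  # rounding error only
print(c[m - 1][m - 1], a[m] * b[m - 1])               # corner element a_m b_{m-1}
```

The second printed line compares the corner element $c_{m-1,m-1}$ with $a_m b_{m-1}$, the only term of its defining sum that survives.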

When k homogeneous boundary conditions are assigned, then in effect k
rows and columns of the coefficient matrix $c_{pq}$ are deleted, since the corresponding
terms fall out. The estimate we require is essentially that the
remaining, or residual, quadratic form, be non-positive. We consider two cases
here: when it is negative definite, and when it is zero.

Let the residual form be negative definite; then by altering the k deleted
rows and columns we can arrange that the new (enlarged) form should also
be negative definite. Then an estimate of the type of (5, Lemma 3.1) will
hold, though in the opposite direction. This property of the constant-coefficients
case, which is a local property for variable coefficients, enables us to
deduce the analogue of (5, Theorem 3.1) which applies to the case of variable
coefficients. It now reads

(7.4)    $I_T(t) = \int_{T_t} G_x(D_i, D_x, \bar D_i, \bar D_x)\,u\bar u\,dS_T \le -c\,E_{m-1}(t) + c\,E_{m-2}(t), \qquad c > 0,$

where

(7.5)    $E_s(t) = \int_{T_t} \sum_{|\alpha| \le s} |D^\alpha u|^2\,dS_T.$

We must remember that the assigned homogeneous boundary data are inserted
on the left in (7.4). The first term on the right in (7.4) can be replaced
by zero; we shall now estimate the second one, in much the same way as
in (6.11). Let each point of $T_t$ be joined to a point of $S_0$ by a line $x + t$ =
const, $x^\rho$ = const, $\rho = 1, \dots, N - 2$; and let us express the integrand of
$E_{m-2}(t)$ as the integral of its derivative along this line. Thus derivatives of
order $m - 1$ or less appear; and integration over $T_t$ leads to integrals over
a “triangular” portion of $R_t$ making their appearance. Application of Schwarz’
inequality now leads at once to the estimate (6.15) which we require.

This method of negative definite character for the residual quadratic form 
will be used in Case I below. 

For Case II, where the residual quadratic form vanishes identically at
every point of $T_t$, we must use a different approach to gain the result for
variable coefficients. Let us use the fact that the coefficients are continuous,
and, given a fixed function $u$, together with an arbitrary positive number $\varepsilon$,
subdivide the boundary surface $T_t$ into a finite number of portions $T_h$, in each
of which the oscillation of the coefficients is less than $\varepsilon$. Select a point $x_0^h$
in each $T_h$, and write

    $\int_{T_h} G_x(D_i, D_x, \bar D_i, \bar D_x)\,u\bar u\,dS_T = \int_{T_h} G_{x_0}(D_i, D_x, \bar D_i, \bar D_x)\,u\bar u\,dS_T + \int_{T_h} R(D_i, D_x, \bar D_i, \bar D_x)\,u\bar u\,dS_T,$

where $G_{x_0}$ is the boundary form with constant coefficients evaluated at $x_0^h$, and the
coefficients in the remainder term $R(D, \bar D)\,u\bar u$ are all less than $\varepsilon$ in magnitude.

Suppose now that $u$ satisfies the homogeneous boundary conditions; then
by hypothesis the first integral on the right vanishes. The second integral
can be estimated to be less than

    $\varepsilon m \int_{T_h} \sum_{|\alpha| \le m-1} |D^\alpha u|^2\,dS_T.$

It follows by summation over the $T_h$ that the boundary integral $I_T(t)$ is
less than

    $\varepsilon m\,E_{m-1}(t)$

in magnitude. However, $u$, and therefore $E_{m-1}(t)$, are fixed, and $\varepsilon$ is arbitrary.
Consequently $I_T(t)$ must vanish. For the variable coefficient problem,
it is therefore sufficient that the coefficients at each point of $T_t$ should lead
to a vanishing residual matrix. We employ this result in Case II below.


8. Case I: $k = m - 1$. The corner element $c_{m-1,m-1}$ of the coefficient
matrix of the above quadratic form is a polynomial of degree zero in the
$\xi_i$; it is in fact $a_m b_{m-1}$. If this element is negative then the conditions of the
lemma will be fulfilled when the function and its first $m - 2$ derivatives
$u, u_x, \dots, u_x^{(m-2)}$ are given as zero on $T$. Now this boundary condition will
be appropriate if $k = m - 1$ characteristic surfaces lie between S and T: we
assume this. Consequently one characteristic surface lies between T and the
portion of S prolonged beyond the edge $C = S \cap T$. Define an auxiliary
co-ordinate $z = t\cos\alpha + x\sin\alpha$, and let $\alpha$ range from $\alpha = 0$ to $\alpha = \tfrac12\pi$. The
coefficient of $D_z^m$ in $P(D)$ is equal to 1 when $\alpha = 0$ and $z$ coincides with $t$. Since
this coefficient vanishes when the surface $z$ = const is characteristic, and
changes sign at a simple characteristic surface, it vanishes once for $0 < \alpha < \tfrac12\pi$
and is therefore negative. For $\alpha = \tfrac12\pi$ it is $a_m$, which is thus negative.

To make $b_{m-1}$ positive, we should, in view of this discussion and of the
fact that the characteristic surfaces of $Q(D)$ must separate those of $P(D)$,
arrange that all $m - 1$ of these surfaces should lie between S and T. That
is, T must be spacelike with respect to Q(D), or equivalently the normal to 
T must be timelike. We shall assume that it is possible to find an auxiliary 
operator which has this property. For example, if m = 2, the order of Q(D) 
is 1 and the characteristic surface can be chosen to have any direction between 
S and T. In the general case, it will be possible to find such an operator 
whenever T lies sufficiently close to the single characteristic surface G which
lies outside the domain between S and T. 

Together with Lemma 2 this demonstrates the following. 


THEOREM 2. Let $k = m - 1$ and suppose there exists an operator $Q(D)$
separating the sheets of $P(D)$ such that all $m - 1$ characteristic surfaces of $Q(D)$
lie between S and T in the region R. Then there exists a solution of $Lu = f$, with
given Cauchy data on S, and with given values for the $m - 1$ quantities $u, u_x,
u_{xx}, \dots, u_x^{(m-2)}$ on T.


When the coefficients of the differential equation are independent of $x$, it
is possible to show that two other sets of $m - 1$ boundary conditions can
be reduced to the set just treated. We write $Lu = u_x^{(m)} + a_{m-1}u_x^{(m-1)} + \dots
+ a_0 u$, where $a_{m-k}$ is a differential operator of order $k$ in $D_t$, $D_\rho$, with coefficients
independent of $x$.


COROLLARY. Let the coefficients in (1.1) be independent of x. Then there exists
a solution of the preceding mixed problem when the boundary conditions are

(a) $u_x^{(h)} = 0$, $h = 0, 1, \dots, m - 3, m - 1$ ($h \ne m - 2$),
or
(b) $u_x^{(h)} = 0$, $h = 1, 2, \dots, m - 1$ ($h \ne 0$).


To prove (a) let us suppose that $v$ is a solution of this problem in the
analytic case, and let us show that $v = u_x + a_{m-1}u$, where $u$ is a solution
of a suitably selected problem with $u, u_x, \dots, u_x^{(m-2)}$ vanishing on T. Since
all $m - 1$ characteristic surfaces between S and T are select, it follows from
the reduction of Theorem 1 and the uniqueness theorem of (3, § 9) that
the solution is unique. Now let $u$ satisfy $Lu = g$, where $g$ is a solution, vanishing
on T, of the first order linear partial equation

    $\left(\dfrac{\partial}{\partial x} + a_{m-1}\right) g = f.$

Since the coefficient of $\partial g/\partial x$ is not zero, and since $f$ is supposed analytic,
such an analytic solution $g$ exists and is uniquely determined. Since $g = \int f\,ds$,
where the integration is taken along a characteristic curve, we can find $L^2$
estimates for $g$ and its derivatives if such estimates are given for $f$.

Formal calculation, using the non-dependence on $x$ of the coefficients, now
shows that the combination $w = u_x + a_{m-1}u$ satisfies $w = 0$, $w_x = 0, \dots,
w_x^{(m-3)} = 0$ on T, while $w_x^{(m-1)} = u_x^{(m)} + a_{m-1}u_x^{(m-1)} = -a_{m-2}u_x^{(m-2)} - \dots
- a_0 u + g$, which latter expression also vanishes on T by the boundary
conditions for $u$ and $g$.

Now

    $Lw = L(u_x + a_{m-1}u) = \left(\dfrac{\partial}{\partial x} + a_{m-1}\right) Lu = \left(\dfrac{\partial}{\partial x} + a_{m-1}\right) g = f,$

so that $w$ is an analytic solution of the case (a). Hence, by the uniqueness
property, $v = w = u_x + a_{m-1}u$. Since we have found $L^2$ bounds for $u$ and
its derivatives, it now follows that such bounds can be obtained for $v$ if one
degree of differentiability extra is assumed for the non-analytic problem. The
remainder of the existence proof now follows the conventional methods and
so is omitted.

The demonstration of case (b) is similar in principle, but a different device
is used. We note that the “coefficient” $a_0$, which is a differential operator of
order $m$ in the other $N - 1$ derivatives, contains just those terms of $Lu$ not
involving $\partial/\partial x$, and so is regularly hyperbolic in the $N - 1$ variables. Also,
the edge $C = S \cap T$ is a spacelike surface relative to $a_0$, so that the Cauchy
problem $a_0 z = 0$, with Cauchy data on C, is a correctly set problem within
the boundary T. We will show that a solution $v$ of case (b) is equal to a
combination

    $w = u_x^{(m-1)} + a_{m-1}u_x^{(m-2)} + \dots + a_1 u,$

when certain preliminary reductions have been made. As the non-homogeneous
boundary conditions corresponding to Theorem 2 can be set up by a substitution,
we shall consider the problem $v_x = f_1$, $v_{xx} = f_2, \dots, v_x^{(m-1)} = f_{m-1}$ on T,
with $Lv = 0$ in R, and, as usual, zero Cauchy data. Subtracting from this a
suitable solution of the problem with $u_x^{(m-1)}$ not given on T, we can suppose
without loss of generality that $f_1 = f_2 = \dots = f_{m-2} = 0$.

Now let $u$ be that solution of $Lu = 0$ with $u = u_x = \dots = u_x^{(m-3)} = 0$
on T, with $u_x^{(m-2)} = z$ on T, where $z$ is the solution of $a_0 z = -f_{m-1}$ on T.
Straightforward calculation shows that $w$, defined above, satisfies $Lw = 0$
with $w_x = w_{xx} = \dots = w_x^{(m-2)} = 0$ on T, while

    $w_x^{(m-1)} = u_x^{(2m-2)} + a_{m-1}u_x^{(2m-3)} + \dots + a_1 u_x^{(m-1)} = -a_0 u_x^{(m-2)} = f_{m-1}$

on T. Thus $w$ is an analytic solution of the problem and so is equal to $v$. Hence
estimates for $u$ can be applied now to $v$, provided that $m - 1$ extra degrees
of differentiability are assumed for the original problem. This completes the
reduction of case (b) to the conventional energy integral method.
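
The boundary values of $w$ in case (b) can be traced in a few lines. The following worked derivation is a sketch that assumes $w = u_x^{(m-1)} + a_{m-1}u_x^{(m-2)} + \dots + a_1 u$ with $Lu = u_x^{(m)} + a_{m-1}u_x^{(m-1)} + \dots + a_0 u = 0$, and uses only that the $a_j$ have coefficients independent of $x$, so that $\partial/\partial x$ commutes with each $a_j$.

```latex
\begin{align*}
w_x &= u_x^{(m)} + a_{m-1}u_x^{(m-1)} + \dots + a_1 u_x
     = Lu - a_0 u = -a_0 u, \\
w_x^{(j)} &= -a_0\, u_x^{(j-1)} \qquad (j \ge 1), \\
\intertext{so that on $T$, where $u = u_x = \dots = u_x^{(m-3)} = 0$
and $u_x^{(m-2)} = z$ with $a_0 z = -f_{m-1}$,}
w_x^{(j)} &= 0 \qquad (1 \le j \le m-2), \\
w_x^{(m-1)} &= -a_0\, u_x^{(m-2)} = -a_0 z = f_{m-1}.
\end{align*}
```

Here $a_0$ acts only in the variables tangential to $T$, so the vanishing of $u_x^{(j-1)}$ on $T$ carries over to $a_0 u_x^{(j-1)}$.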


9. Case II: Symmetry with respect to T. The second circumstance in
which it can be shown that the residual quadratic form can be bounded above
is when the hyperplane $T : x = 0$ is a plane of symmetry for the characteristic
cone of the hyperbolic differential operator. We shall here restrict consideration
to equations of even order, as it is necessary, for the odd order case, to
make rather lengthy changes in the “analytic” existence theorems to cover
this situation.

Thus let $Lu$ be a regularly hyperbolic operator of even order $m = 2l$, of
which the highest order terms contain $D_x^2$ but not odd powers of $D_x$, at any
rate for $x = 0$. Then T is a plane of symmetry as stated above. If now the
terms of order $2l$ are written as in (7.2), it is seen that $a_r(\xi_i) = 0$ for $r$ odd.
Let us take $Q(\zeta) = \partial P/\partial\zeta_t$; as shown in (8, p. 140) the sheets of the cone
of $Q$ will separate those of $P$ as required for the formulation of the estimates.
Thus the odd terms $b_s(\xi_i)$ in $Q(\zeta)$ will likewise vanish.

From (7.3) it is seen that in

    $c_{pq} = \sum_{s=0}^{\min(p,q)} (b_s\,a_{p+q+1-s} - a_s\,b_{p+q+1-s})$

the sum of indices of the $a$ and $b$ coefficients in each product is $p + q + 1$, which is odd
whenever $p + q$ is even. It follows that each term will contain a vanishing
factor when $p + q$ is even, and therefore that $c_{pq} = 0$ for $p + q$ even. Thus
about half the terms in the matrix are zero, including all diagonal terms, all
terms twice removed from the diagonal, and so on.
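
The parity argument can be illustrated numerically. The sketch below assumes the formula $c_{pq} = \sum_{s=0}^{\min(p,q)} (b_s a_{p+q+1-s} - a_s b_{p+q+1-s})$ of (7.3), sets the odd-indexed coefficients of both polynomials to zero, and confirms that every entry with $p + q$ even vanishes. The order `m = 4` and the random even-indexed values are illustrative assumptions; the function name is hypothetical.

```python
import random

def boundary_matrix(a, b):
    """Matrix c[p][q] of (7.3); a = [a_0..a_m], b = [b_0..b_{m-1}]."""
    m = len(a) - 1
    A = lambda s: a[s] if s <= m else 0.0
    Bc = lambda s: b[s] if s <= m - 1 else 0.0
    return [[sum(Bc(s) * A(p + q + 1 - s) - A(s) * Bc(p + q + 1 - s)
                 for s in range(min(p, q) + 1))
             for q in range(m)] for p in range(m)]

random.seed(1)
m = 4  # even order m = 2l with l = 2 (sample value)
# Symmetric case: a_r = 0 for r odd and b_s = 0 for s odd.
a = [random.uniform(0.5, 1.5) if r % 2 == 0 else 0.0 for r in range(m + 1)]
b = [random.uniform(0.5, 1.5) if s % 2 == 0 else 0.0 for s in range(m)]
c = boundary_matrix(a, b)

# Entries with p + q even, which should all be zero.
even_entries = [c[p][q] for p in range(m) for q in range(m) if (p + q) % 2 == 0]
# Residual block when the even-order derivatives are set to zero (p, q odd).
residual_odd = [c[p][q] for p in range(1, m, 2) for q in range(1, m, 2)]

print(max(abs(x) for x in even_entries))
print(max(abs(x) for x in residual_odd))
```

The block `residual_odd` is what remains of the matrix when the rows and columns with even index are deleted, so its vanishing corresponds to a zero residual form.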

From the symmetry of the characteristic surfaces relative to the boundary
T, we see that half of the sheets lie in R, and thus $l$ boundary conditions
are appropriate.


THEOREM 3. Let $Lu = f$ be a regularly hyperbolic equation of even order $2l$, such
that the boundary surface T is a plane of symmetry relative to the characteristic
cone at each point of T. Then there exists a solution of the $C^{2l + [\frac12 N] + 1}$ mixed
problem with zero data assigned on T for either

(a) $l$ derivatives of even order: $u, u_x^{(2)}, \dots, u_x^{(2l-2)}$, or

(b) $l$ derivatives of odd order: $u_x, u_x^{(3)}, \dots, u_x^{(2l-1)}$.


Proof. In case (a), an element $c_{pq}$ belongs to the residual matrix only if
both $p$ and $q$ are odd, and thus $p + q$ is even. Hence the residual part of the
quadratic form is identically zero. Similarly, in case (b), the residual part
contains elements with both $p$ and $q$ even so that $c_{pq}$ vanishes and the quadratic
form is zero. An application of Lemma 2 completes the proof.

For the case $m = 2l = 2$ we can use the Lorentz transformation to show
that the symmetry requirement can always be satisfied.


10. Signature of the quadratic form. We conclude with some remarks
on the algebraic structure of the quadratic form $G_x(\zeta_x, \bar\zeta_x)$. An adaptation of
Gårding’s analysis (5, Lemma 1.1) to the case where complex roots are present
shows that the signature of this quadratic form is always compatible with
the number of boundary conditions suggested by the arrangement of characteristic
surfaces. However, the coefficients of this quadratic form are functions
of the variables $\xi_i$ dual to $t$ and $x^\rho$ ($\rho = 1, \dots, N - 2$), and consequently
the eigenvectors corresponding to the negative eigenvalues of the
coefficient matrix depend on the $\xi_i$. As these eigenvectors are linear combinations
of the transforms of the x-derivatives, it follows that in general k
linear conditions (with coefficients depending on the $\xi_i$) on the transforms
$\hat u_x^{(h)}(\xi_i)$ are required as boundary conditions in order that the residual matrix
should be bounded above. Upon transformation back to the $t$, $x^\rho$ variables,
these relations would become integral conditions of convolution type on the
derivatives of $u$. It seems probable that Gårding’s direct method could be
modified to include this rather unconventional type of condition.


11. A remark. I wish to correct a misstatement in the paper (3) on 
first order systems. On page 154, the fourth line from the bottom of the
page should read “‘Let a non-singular analytic family of surfaces S, fill R in 


such fashion that through each point of R there passes one and only one
surface S, of the family, and that S; = S,.;.” 


12. Acknowledgments. I am indebted to W. P. Brown for assistance 
with the algebra of Lemma 1. Professors K. O. Friedrichs and A. Robinson 
provided criticism which has led to certain improvements and clarifications, 
and for which I am most grateful. 
















REFERENCES 


1. L. L. Campbell and A. Robinson, Mixed problems for hyperbolic differential equations, Proc.
Lond. Math. Soc. (3), 5 (1955), 129-47.

2. G. F. D. Duff, A mixed problem for normal hyperbolic linear partial differential equations of
second order, Can. J. Math., 9 (1957), 141-60.

3. G. F. D. Duff, Mixed problems for linear systems, Can. J. Math., 10 (1958), 127-60.

4. L. Gårding, Dirichlet’s problem, Math. Scand., 1 (1953), 55-72.

5. L. Gårding, Solution directe du problème de Cauchy pour les équations hyperboliques, Proc. Coll.
Int. du C.N.R.S. LXXI (Paris, 1956), 71-90.

6. L. Hörmander, On the theory of general partial differential operators, Acta Math., 94 (1955),
161-248.

7. M. Krzyżański and J. Schauder, Quasi-lineare Differentialgleichungen zweiter Ordnung vom
hyperbolischen Typus, Gemischte Randwertaufgaben, Studia Math., 6 (1936), 152-89.

8. L. Kuipers and B. Meulenbeld, Symmetric polynomials with non-negative coefficients, Proc.
Amer. Math. Soc., 6 (1955), 88-93.

9. J. Leray, Hyperbolic differential equations (Princeton, 1953).

10. J. L. Lions, Problèmes aux limites en théorie des distributions, Acta Math., 94 (1955),
13-153.

11. D. E. Littlewood, The theory of group characters (Cambridge, 1939).

12. V. Thomée, Estimates of the Friedrichs-Lewy type for mixed problems in the theory of linear
hyperbolic differential equations in two independent variables, Math. Scand., 6 (1957),
93-113.




University of Toronto 











AN IMPROVED RESULT CONCERNING SINGULAR 
MANIFOLDS OF DIFFERENCE POLYNOMIALS 


RICHARD M. COHN 


1. Introduction. Let $\mathcal{K}$ be a difference field of characteristic 0, $\mathfrak{M}$ an
irreducible manifold of effective order $n$ over $\mathcal{K}\{y\}$, and $F$ an algebraically
irreducible difference polynomial in $\mathcal{K}\{y\}$ of effective order $n + k$, $k > 0$,
which vanishes on $\mathfrak{M}$. In an earlier paper (2, p. 447) I gave necessary conditions,
restated below as (a), (b), and (c) of the main theorem, for $\mathfrak{M}$ to be
an essential singular manifold of $F$. These conditions are analogous to the
low power criterion of Ritt (1, p. 65) for the corresponding problem of differential
algebra. Like that criterion they depend, in the special case that $\mathfrak{M}$ is
the manifold of $y$, only on which power products appear effectively in $F$.
Unlike the low power criterion, however, conditions (a), (b), and (c) are only
necessary, not sufficient. I have proved the following results (2, p. 459; 4)
concerning sufficiency:

(1) if $k = 1$, the conditions are never satisfied, so that $\mathfrak{M}$ is not an essential
singular manifold of $F$;

(2) if $k = 2$, $n = 0$, the conditions are both necessary and sufficient;

(3) if $k > 2$, the conditions are not sufficient, even if $n = 0$. Moreover, no condition
dependent only on which power products of $y$ and its transforms appear
effectively in $F$ is sufficient in the special case that $\mathfrak{M}$ is the manifold of $y$.


I shall now show that the restriction $n = 0$ may be removed from (2).
Hence, there is a close analogy with the situation in differential algebra
described by the low power theorem in the case that the effective order of
the difference polynomial $F$ exceeds by 2 the effective order of the manifold
$\mathfrak{M}$, but only a partial analogy in all other cases.

The proof for $k = 2$ and “general” $n$ is based on a preparation theorem
suggested by the preparation theorem used by Ritt for differential polynomials.
The preparation theorem of difference algebra (restricted to the case that $\mathcal{K}$
is inversive) consists of the relations (3) and (4) of § 5 between $F$ and the
first polynomial $A$ of the characteristic set of the reflexive prime difference
ideal with manifold $\mathfrak{M}$. The conditions (a), (b), and (c) of the main theorem
are equivalent to the conditions ($\alpha$) and ($\beta$), stated in § 8, for (3) and (4).
These conditions in turn imply that $\mathfrak{M}$ is an essential singular manifold of $F$
in the case $k = 2$. The proof of this is accomplished by a minor modification
of the power series method used in (4) for the special case $n = 0$ and the
conditions (a), (b), and (c).


Received July 11, 1958. This investigation was supported in part by a grant from the 
Rutgers University Research Fund. 




2. The weight function $f(\theta)$ of a term

    $c\,y^{a_0} y_1^{a_1} \cdots y_h^{a_h}, \qquad c \ne 0,\ c \in \mathcal{K},$

of a difference polynomial of $\mathcal{K}\{y\}$ is defined to be the polynomial $a_0 + a_1\theta +
\dots + a_h\theta^h$. The indeterminate $\theta$ is called the weight parameter. The weight of
this term for a value $r$ of the weight parameter is $f(r)$. If an element $c$ whose
transform is defined to be $c^r$ is substituted formally into a term, then the
exponent of $c$ in the result is the weight of the term for the value $r$ of the
weight parameter.
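
The connection between the weight and formal substitution can be made concrete. In the sketch below a term is represented by its exponent list $(a_0, a_1, \dots)$, the weight is $\sum_i a_i r^i$, and the formal substitution replaces the $i$-th transform of $y$ by $c^{r^i}$. The sample term, the base `c = 2` and the value `r = 3` are assumptions chosen for illustration, and the coefficient of the term is ignored.

```python
def weight(exponents, r):
    """Weight f(r) = a_0 + a_1 r + a_2 r^2 + ... of a term with these exponents."""
    return sum(a * r**i for i, a in enumerate(exponents))

def formal_value(exponents, c, r):
    """Value of the power product when y is replaced by c and each
    transform is obtained by raising the previous one to the power r."""
    value, y = 1, c
    for a in exponents:
        value *= y**a
        y = y**r  # next transform: (c^(r^i))^r = c^(r^(i+1))
    return value

exponents = [2, 1, 1]   # hypothetical term y^2 * y_1 * y_2
c, r = 2, 3             # sample base and weight parameter

print(weight(exponents, r))           # 2 + 1*3 + 1*9
print(formal_value(exponents, c, r))  # equals c ** weight(exponents, r)
```

With integer inputs the comparison is exact, so the exponent of `c` in the substituted term can be read off directly as the weight.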

Let $F$ and $\mathfrak{M}$ be as in § 1. Denote by $\alpha$ a generic zero of $\mathfrak{M}$ and by $\Sigma$ the
reflexive prime difference ideal with manifold $\mathfrak{M}$. Let $A$ be an algebraically
irreducible polynomial in $\Sigma$ of effective order $n$; if $\mathcal{K}$ is inversive, $A$ is the
first polynomial of a characteristic set of $\Sigma$ or one of its transforms. If $P$ is
any polynomial of $\mathcal{K}\{y\}$, then $\bar P$ is to denote the polynomial of $\mathcal{K}\langle\alpha\rangle\{z\}$
which is obtained from $P$ by the substitution $y = z + \alpha$, and $P^*$ the polynomial
consisting of the terms of least degree of $\bar P$. We can now state the
main theorem.


THEOREM. In order that $\mathfrak{M}$ be an essential singular manifold of F it is necessary
that:

(a) there exist a term of F which is of lower weight than any other term for 
every positive value r < 1 of the weight parameter, 

(b) there exist a term of F which is of lower weight than any other term for 
every value r > 1 of the weight parameter, 

(c) every solution of F* be a solution of A*. 

These conditions are sufficient if k = 2. 
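
Conditions (a) and (b) ask for a single term whose weight is strictly lowest over a whole range of the weight parameter. The following sketch tests this on a finite sample of values of $r$, so it can only suggest, not prove, that a condition holds. Terms are given by exponent lists, and the two sample polynomials and the sampling grids are assumptions made for illustration.

```python
def weight(exponents, r):
    """Weight a_0 + a_1 r + a_2 r^2 + ... of a term."""
    return sum(a * r**i for i, a in enumerate(exponents))

def lowest_term(terms, r_values):
    """Index of the term that is strictly lowest in weight at every sampled r,
    or None if no single term has that property on the sample."""
    candidate = None
    for r in r_values:
        weights = [weight(t, r) for t in terms]
        low = min(weights)
        if weights.count(low) != 1:
            return None          # tie at this r
        index = weights.index(low)
        if candidate is None:
            candidate = index
        elif candidate != index:
            return None          # the minimizing term changes with r
    return candidate

below_one = [k / 100 for k in range(1, 100)]       # samples of 0 < r < 1
above_one = [1 + k / 20 for k in range(1, 201)]    # samples of 1 < r <= 11

terms_ok = [(0, 2, 0), (1, 0, 1)]    # hypothetical terms y_1^2 and y*y_2
terms_bad = [(2, 0, 0), (0, 1, 0)]   # hypothetical terms y^2 and y_1

print(lowest_term(terms_ok, below_one), lowest_term(terms_ok, above_one))
print(lowest_term(terms_bad, below_one), lowest_term(terms_bad, above_one))
```

For `terms_ok` the weights are $2r$ and $1 + r^2$, which differ by $(1 - r)^2 > 0$ for $r \ne 1$, so the first term is lowest on both ranges. For `terms_bad` the weights $2$ and $r$ cross at $r = 2$, so no single term is lowest for all $r > 1$.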


It only remains to prove sufficiency in the case k = 2. The rest of this 
paper is devoted mainly to this proof. In the last section a method is given 
for testing conditions (a), (b), and (c) constructively if a beginning of a 
characteristic sequence of the ideal $\Sigma$ is known.


3. Proof of a lemma. The first lemma to be proved concerns polynomial 
rings, the second, difference rings. 


LEMMA I. Let $\mathcal{K}$ be a field of characteristic 0, $\Pi$ a prime ideal in the polynomial
ring $\mathfrak{R} = \mathcal{K}[u_1, \dots, u_q; x_1, \dots, x_r]$, the $u_i$ forming a parametric set for $\Pi$.
Let $A_1, \dots, A_r$ be a characteristic set for $\Pi$ with $A_i$ introducing $x_i$. Let $F$ be
a polynomial of $\mathfrak{R}$. Then there exists a polynomial $S$ of $\mathfrak{R}$ which is not in $\Pi$ and
an integer $t$ such that

(1)    $SF = \sum_{i=1}^{t} L_i\,A_1^{p_{i1}} A_2^{p_{i2}} \cdots A_r^{p_{ir}},$

the $L_i$ being polynomials of $\mathfrak{R}$ which are not in $\Pi$, and the $p_{ij}$, $i = 1, \dots, t$;
$j = 1, \dots, r$, constituting $t$ distinct sets of non-negative integers.













Proof. A polynomial of $\mathfrak{R}$ is said to be of class $k$ if it effectively involves
$x_k$ but no $x_i$, $i > k$. The conclusion of the lemma follows trivially if $F$ is free
of all the $x_i$. We shall prove by induction on the class that it is valid for
other polynomials in the following strengthened form: Let $a_i$ denote the
degree of $A_i$ in $x_i$, $i = 1, \dots, r$. Then if $F$ is of class $k$ and degree $f$ in $x_k$ it
is possible to find a relation (1) in which $S$ and the $L_i$ are free of the $x_i$, $i > k$,
and the power products in the $A_i$ which occur on the right-hand side of (1)
involve no $A_i$, $i > k$, and are of degree in $A_k$ less than or equal to the greatest
integer $h_f$ not exceeding $f/a_k$.

The strengthened result holds if $F$ is of class 1. For, if $h \ge 0$ is the greatest
power of $A_1$ which divides $F$, we may write $F = L A_1^h$. This expression is of
the form (1) and meets the added conditions.

Let $F$ be of class $k > 1$, and assume the strengthened conclusion to have
been proved for all polynomials of lower class. Let $f$ be as before. If $f < a_k$
we use the expression

    $F = F_0 + F_1 x_k + \dots + F_f x_k^f,$

each $F_i$ being free of $x_i$, $i \ge k$. For each $F_i$, $0 \le i \le f$, we find an expression
of the form of (1):

(2)    $S_i F_i = \sum_{j=1}^{t_i} L_{ij}\,A_1^{p_{ij1}} \cdots A_{k-1}^{p_{ij,k-1}},$

where $S_i$ and the $L_{ij}$ are not in $\Pi$ and are of class less than $k$.
Let $S$ be the product of the $S_i$. Substituting from the expressions (2) into

    $SF = (S/S_0)S_0F_0 + (S/S_1)S_1F_1x_k + \dots + (S/S_f)S_fF_fx_k^f$

and combining terms involving equal power products of the $A_i$ we obtain
an expression of the form (1) for $SF$. The coefficients $L_i$ of this expression
are polynomials in $x_k$ of degree less than $a_k$, with coefficients free of $x_i$, $i \ge k$,
and not in $\Pi$. Hence, the $L_i$ themselves are not in $\Pi$. Clearly, $S$ is not in $\Pi$,
and $A_k$ does not appear in the power products; so that the strengthened
conclusion is valid for $F$.

We now suppose that $f \ge a_k$ and that the strengthened conclusion has been demonstrated for all polynomials of class $k$ and degree less than $f$ in $x_k$. Applying the division algorithm to $F$ and $A_k^{h_f}$ we find a relation

$JF - MA_k^{h_f} = R$,

where $J$, $M$, and $R$ are polynomials free of $x_i$, $i > k$, $J$ is not in $\Pi$, $M$ is of degree less than $f$ in $x_k$, and $R$ is of degree less than $h_f a_k$ in $x_k$. By the assumption made at the beginning of this paragraph there exists a polynomial $S_1$ not in $\Pi$ and free of $x_i$, $i > k$, such that $S_1R$ is a linear combination of power products of $A_1, \dots, A_k$ of degree less than $h_f$ in $A_k$, the coefficients of these power products being polynomials not in $\Pi$ and free of $x_i$, $i > k$. Since $M$ is in fact of degree $f - h_f a_k < a_k$ in $x_k$, the case $f < a_k$ previously disposed of shows that there exists a polynomial $S_2$ not in $\Pi$ and free of $x_i$, $i > k$, such that $S_2M$ is a linear combination of power products of $A_1, \dots, A_{k-1}$, the coefficients of these power products being polynomials not in $\Pi$ and free of $x_i$, $i > k$.
Let $S = S_1S_2J$. In

$SF = S_1S_2MA_k^{h_f} + S_1S_2R$

we substitute the expressions for $S_2M$ and $S_1R$ just described. There results an expression for $SF$ as a linear combination of power products of $A_1, \dots, A_k$. Those power products obtained from $S_1S_2R$ are of degree less than $h_f$ in $A_k$ and therefore distinct from the power products obtained from

$S_1S_2MA_k^{h_f}$.

One can verify immediately that the expression for $SF$ has the properties prescribed in the strengthened form of the conclusion to Lemma I.
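
The division step can be followed in a toy case. The example below is only an illustration and is not taken from the paper: it assumes no parametric indeterminates, $R_1 = \mathbb{Q}$, $r = 1$, and the prime ideal $\Pi = (x_1^2 - 2)$ with characteristic set $A_1 = x_1^2 - 2$, so that $a_1 = 2$.

```latex
% Toy instance of relation (1): F = x_1^3, so f = 3 and h_f = \lfloor 3/2 \rfloor = 1.
% Division of F by A_1^{h_f} gives J = 1, M = x_1, R = 2x_1.
\begin{align*}
  F  &= x_1^3 = x_1\,(x_1^2 - 2) + 2x_1 ,\\
  SF &= L_1 A_1^{0} + L_2 A_1^{1}, \qquad S = 1,\quad L_1 = 2x_1,\quad L_2 = x_1 .
\end{align*}
% L_1 and L_2 are nonzero of degree 1 < a_1 = 2, hence not divisible by A_1
% and therefore not in \Pi; the exponent sets {0} and {1} are distinct.
```

Here the power products have degree at most $h_f = 1$ in $A_1$, as the strengthened form of the conclusion requires.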


4. A second lemma.


Lemma II. Let $\mathfrak{K}$ be a difference field of characteristic 0, $A$ an algebraically irreducible difference polynomial of order and effective order $n$ in the difference ring $\mathfrak{K}\{y\}$, and $C^{(0)}$ $(= A)$, $C^{(1)}$, $C^{(2)}, \dots$, a characteristic sequence of a non-singular component $\Sigma$ of $\{A\}$. There exist difference polynomials $M^{(0)}$, $M^{(1)}$, $M^{(2)}, \dots$, of orders not exceeding $n, n + 1, n + 2, \dots$, respectively, which are not in $\Sigma$, and for which each product $C^{(k)}M^{(k)}$, $k = 0, 1, \dots$, is a linear combination of $A, A_1, \dots, A_k$ with coefficients of order not exceeding $n + k$, while the coefficient of $A_k$ is not in $\Sigma$.


Proof. We choose $M^{(0)} = 1$. We suppose $M^{(0)}, \dots, M^{(k-1)}$ to have been found and demonstrate the existence of $M^{(k)}$.

Since $A_k$ has remainder 0 with respect to the chain $C^{(0)}, \dots, C^{(k)}$ there is a relation

$TC^{(k)} = JA_k + L^{(0)}C^{(0)} + \dots + L^{(k-1)}C^{(k-1)}$,  $J \notin \Sigma$.

Multiplying both sides of this equation by $M^{(0)} \cdots M^{(k-1)}$, replacing each $M^{(i)}C^{(i)}$, $i = 0, \dots, k - 1$, by the appropriate linear combination of $A, \dots, A_i$, and putting $TM^{(0)} \cdots M^{(k-1)} = M^{(k)}$ there results

$M^{(k)}C^{(k)} = N^{(0)}A + \dots + N^{(k)}A_k$,

with $N^{(k)} = JM^{(0)} \cdots M^{(k-1)} \notin \Sigma$. Since the formal partial derivative $\partial A_k/\partial y_{n+k}$ is not in $\Sigma$, it is immediately seen by differentiation of $M^{(k)}C^{(k)}$ that $M^{(k)}$, too, is not in $\Sigma$. This proves Lemma II.




5. The preparation process. Let $\Sigma$ be a reflexive prime difference ideal of order $n$ in the ring $\mathfrak{K}\{y\}$, where the difference field $\mathfrak{K}$ is inversive and of characteristic 0, and let $F \in \mathfrak{K}\{y\}$ be of order $n + k$, $k > 0$. The following theorem provides two expressions for $F$ in terms of the first polynomial $A$ of the characteristic set of $\Sigma$.








PREPARATION THEOREM. There exist difference polynomials $S$, $T$, of order at most $n + k$, which are not in $\Sigma$, and positive integers $s$, $t$ such that

(3)  $SF = \sum_{i=1}^{s} L^{(i)} A^{p_{i,0}} A_1^{p_{i,1}} \cdots A_k^{p_{i,k}}$,

(4)  $TF = \sum_{i=1}^{t} N^{(i)} A^{q_{i,0}} A_1^{q_{i,1}} \cdots A_k^{q_{i,k}}$.

Here the $p_{i,j}$ are non-negative integers, no two sets $p_{a,j}$, $p_{b,j}$ ($a \ne b$, $j = 0, \dots, k$) are identical, the $q_{i,j}$ have a similar description, and the $L^{(i)}$ are difference polynomials of order not exceeding $n + k$. Those $L^{(i)}$ which are coefficients of terms whose power products in $A, A_1, \dots, A_k$ are of least weight for any positive value of the weight parameter not exceeding 1 are not in $\Sigma$. Also the $N^{(i)}$ are difference polynomials of order not exceeding $n + k$, while those $N^{(i)}$ which are coefficients of terms whose power products in $A, A_1, \dots, A_k$ are of least weight for any value of the weight parameter not less than 1 are not in $\Sigma$.


6. Proof of (3). Let $C^{(0)}$ $(= A)$, $C^{(1)}, \dots, C^{(k)}$ be the first $k + 1$ polynomials of a characteristic sequence of $\Sigma$. They are the characteristic set of the prime ideal

$\Sigma' = \Sigma \cap \mathfrak{K}[y, y_1, \dots, y_{n+k}]$.

According to Lemma I there exists a relation

(5)  $RF = \sum_{i} P^{(i)} (C^{(0)})^{a_{i,0}} \cdots (C^{(k)})^{a_{i,k}}$,

where the $a_{i,j}$ have a description similar to that of the $p_{i,j}$ of (3), and $R$ and the $P^{(i)}$ are polynomials of $\mathfrak{K}[y, y_1, \dots, y_{n+k}]$ which are not in $\Sigma'$, hence not in $\Sigma$.


Let polynomials $M^{(i)}$ be chosen in accordance with Lemma II. Putting $a = \max(a_{i,j})$, let $Q = R(M^{(0)} \cdots M^{(k)})^a$. Then, using (5) and substituting for the $C^{(i)}M^{(i)}$ the linear combinations described in Lemma II, one finds a relation

(3*)  $QF = \sum_{i=1}^{q} J^{(i)} A^{r_{i,0}} \cdots A_k^{r_{i,k}}$,  $Q \notin \Sigma$,

where the $r_{i,j}$ have a description similar to that of the $p_{i,j}$ of (3). We shall show that those $J^{(i)}$ which are coefficients of terms whose power products in the $A_i$ are of least weight for any positive value of the weight parameter less than 1 are not in $\Sigma$.

To the $i$th term of (5) we assign a weight function $w_i(\theta) = a_{i,0} + a_{i,1}\theta + \dots + a_{i,k}\theta^k$. We consider a positive value $\tau < 1$ of the weight parameter. Then, upon the substitutions prescribed above to obtain (3*), the $i$th term of (5) gives rise to a term $T^{(i)}$ of the form

$KA^{a_{i,0}} \cdots A_k^{a_{i,k}}$,

$K \notin \Sigma$, and other terms which, because their power products are formed from that of $T^{(i)}$ by replacing one or more of the $A_j$ by lower transforms of $A$, are of greater weight than $T^{(i)}$. The coefficients of these terms may be in $\Sigma$.
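
The weight comparison used here can be tried numerically. The following is a minimal sketch, not part of the paper: the function names `weight` and `least_weight_terms` and the sample exponent tuples are hypothetical, and a power product $A^{e_0}A_1^{e_1}\cdots A_k^{e_k}$ is represented only by its exponent tuple.

```python
from fractions import Fraction


def weight(exponents, theta):
    """Weight e_0 + e_1*theta + ... + e_k*theta**k of a power product
    A^e_0 A_1^e_1 ... A_k^e_k, given as the tuple (e_0, ..., e_k)."""
    return sum(e * theta**j for j, e in enumerate(exponents))


def least_weight_terms(terms, theta):
    """Exponent tuples of least weight for one value of the weight parameter."""
    best = min(weight(t, theta) for t in terms)
    return [t for t in terms if weight(t, theta) == best]


# Hypothetical exponent tuples for k = 2: A, A_1 and A_2^2.
terms = [(1, 0, 0), (0, 1, 0), (0, 0, 2)]
small = least_weight_terms(terms, Fraction(1, 3))
at_one = least_weight_terms(terms, Fraction(1))
```

For a parameter below 1, replacing $A_1$ by the lower transform $A$ raises the weight, which is the mechanism the paragraph above relies on; at the value 1 the weight is simply the total degree.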

If, in particular, the $i$th term of (5) is one of the terms whose power products are of least weight for the value $\tau$ of the weight parameter, then no term of (5) will yield a term of lower weight than $T^{(i)}$. Those terms of (5) of the same weight as $T^{(i)}$ will yield terms of this weight but with different power products. Hence, $T^{(i)}$ will actually be a term of (3*), and one of least weight for the value $\tau$ of the weight parameter. Clearly, all terms of (3*) of least weight for the value $\tau$ of the weight parameter arise in the same way as $T^{(i)}$ and, hence, have coefficients which are not in $\Sigma$. Hence, (3*) has the property claimed for (3) in the preparation theorem, except possibly for terms of least weight for the value 1 of the weight parameter, that is, for terms of least degree in the $A_i$.

Because the weight function is continuous, at least one of the terms of least degree of (3*) is a term of least weight for a value of the weight parameter less than 1, and hence has a coefficient which is not in $\Sigma$.

Suppose the term

$J^{(i)}A^{r_{i,0}} \cdots A_k^{r_{i,k}}$

is a term of (3*) which is of least degree, but that $J^{(i)} \in \Sigma$. Following the procedure used to obtain (3*) we find an equation


(6)  $PJ^{(i)} = \sum_{j} H^{(j)} A^{c_{j,0}} A_1^{c_{j,1}} \cdots A_k^{c_{j,k}}$,  $P \notin \Sigma$.
No term of the right-hand side of (6) is free of the $A_i$. For such a term would be of least weight for values of the weight parameter less than 1; hence, its coefficient would not be in $\Sigma$. This would yield a contradiction to the fact that $J^{(i)} \in \Sigma$.

Let $Q' = QP$. Multiplying both sides of (3*) by $P$, and substituting for $PJ^{(i)}$ from (6), we obtain an expression for $Q'F$ whose terms are those of the right-hand side of (3*) multiplied by $P$, except that the $i$th term of (3*) has been replaced by terms whose power products are multiples of its power product by power products of positive degree. Consequently, the terms of this expression which are of least weight for values of the weight parameter less than 1 have coefficients which are not in $\Sigma$, and the number of terms of least degree with coefficients in $\Sigma$ is less than the number of such terms in (3*). Continuing the procedure just described, we obtain the equation (3).


7. Proof of (4). We define the difference field $\mathfrak{K}'$ to be the difference field whose elements are those of $\mathfrak{K}$ with the same addition and multiplication operations, but with transforming defined to be the inverse of the transforming operation of $\mathfrak{K}$. Let $\mathfrak{L}$ denote the inversive extension of $\mathfrak{K}\langle\alpha\rangle$, where $\alpha$ denotes a generic zero of $\Sigma$. We define $\mathfrak{L}'$ to have the same relation to $\mathfrak{L}$ as $\mathfrak{K}'$ to $\mathfrak{K}$. Then $\mathfrak{L}'$ is an extension of $\mathfrak{K}'$.

Let $P \in \mathfrak{K}\{y\}$ be of order at most $n + k$. We define $P'$ to be the polynomial of $\mathfrak{K}'\{z\}$ obtained by replacing each $y_i$ in $P$ by $z_{n+k-i}$. The operation $'$ produces a one-one correspondence between difference polynomials of order at most $n + k$ in $\mathfrak{K}\{y\}$ and in $\mathfrak{K}'\{z\}$. In particular, $B = (A_k)'$ is of order $n$, and $B_h$, $0 \le h \le k$, is $(A_{k-h})'$. $\alpha_{n+k}$ (where the subscript refers to transforming in $\mathfrak{L}$) is a generic zero of a reflexive prime difference ideal $\Sigma'$ of $\mathfrak{K}'\{z\}$ whose characteristic set begins with $B$. The correspondence produced by $'$ maps the polynomials of $\Sigma$ of order not exceeding $n + k$ onto the polynomials of $\Sigma'$ of order not exceeding $n + k$.

Since (3) has been established, we know that there exists a relation

(4')  $T'F' = \sum_{i=1}^{t} N^{(i)\prime} B^{q_{i,k}} B_1^{q_{i,k-1}} \cdots B_k^{q_{i,0}}$

meeting requirements corresponding to those imposed on (3). Now (4') yields (4) on application of the inverse of the correspondence produced by $'$. Clearly, $T \notin \Sigma$. It remains only to show that the $N^{(i)}$ have the stated property. Let $T^{(i)}$ denote the $i$th term of (4) and

$w_i(\theta) = q_{i,0} + q_{i,1}\theta + \dots + q_{i,k}\theta^k$

its weight function. The $i$th term $T^{(i)\prime}$ of (4') has the weight function

$v_i(\theta) = q_{i,k} + \dots + q_{i,0}\theta^k = \theta^k w_i(1/\theta)$.

Hence, if $T^{(i)}$ is a term of (4) of least weight for the value $\tau \ge 1$ of the weight parameter, then $T^{(i)\prime}$ is a term of least weight of (4') for the value $1/\tau \le 1$ of the weight parameter. Then $N^{(i)\prime} \notin \Sigma'$, so $N^{(i)} \notin \Sigma$. This completes the proof of the preparation theorem.
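
The identity between the two weight functions, and the reason it transfers "least weight" from one relation to the other, can be written out in one line. This is a worked restatement, assuming only $w_i(\theta) = q_{i,0} + q_{i,1}\theta + \dots + q_{i,k}\theta^k$ and $v_i(\theta) = q_{i,k} + q_{i,k-1}\theta + \dots + q_{i,0}\theta^k$ as in the text.

```latex
\begin{align*}
  \theta^{k}\, w_i(1/\theta)
    &= \theta^{k}\bigl(q_{i,0} + q_{i,1}\theta^{-1} + \dots + q_{i,k}\theta^{-k}\bigr) \\
    &= q_{i,0}\theta^{k} + q_{i,1}\theta^{k-1} + \dots + q_{i,k}
     = v_i(\theta).
\end{align*}
% Since \theta^k > 0 for \theta > 0, for any two terms i and j:
%   v_i(\theta) \le v_j(\theta)  \iff  w_i(1/\theta) \le w_j(1/\theta).
```

So a term minimizes $v$ at $\theta = 1/\tau$ exactly when the corresponding term minimizes $w$ at $\tau$.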


8. Proof of Equivalence. We now assume that $F$ and $A$ as described in § 5 also satisfy the conditions (a), (b), and (c) stated in the main theorem. As in that theorem, $F$ is to be irreducible and vanish on the irreducible manifold $\mathfrak{M}$ of order $n$. As in the discussion of the preparation theorem, $\mathfrak{K}$ is assumed to be inversive, and $A$ is chosen to be the first polynomial of the characteristic set of the reflexive prime difference ideal $\Sigma$ with manifold $\mathfrak{M}$. In addition, we assume that $F$ is of order and effective order $n + k$ with $k > 0$. It will be shown that there exists a power product $U$ of the $A_i$ which is of positive degree and is free of $A$ and $A_k$ such that

($\alpha$) the right-hand side of (3) contains a term $YU$, $Y \notin \Sigma$, which is the term of least weight in the $A_i$ for each positive value of the weight parameter not greater than 1,

($\beta$) the right-hand side of (4) contains a term $ZU$, $Z \notin \Sigma$, which is the term of least weight in the $A_i$ for each value of the weight parameter not less than 1.








We follow the notation used in the statement of the main theorem. Because $\partial A/\partial y$ and $\partial A/\partial y_n$ are not in $\Sigma$, $A^*$ is a polynomial of first degree which effectively contains $z$ and $z_n$. $S^*$ and $T^*$ are in $\mathfrak{K}\langle\alpha\rangle$. We find

(7)  $F^* = \sum' (L^{(i)*}/S^*)(A^*)^{p_{i,0}} \cdots (A_k^*)^{p_{i,k}}$,

(8)  $F^* = \sum'' (N^{(i)*}/T^*)(A^*)^{q_{i,0}} \cdots (A_k^*)^{q_{i,k}}$,

where $\sum'$ and $\sum''$ are taken over those terms of (3) and (4) respectively which are of least degree in the $A_i$. It follows from the preparation theorem that the $L^{(i)*}$ and $N^{(i)*}$ appearing in these sums are elements of $\mathfrak{K}\langle\alpha\rangle$.
Suppose that $\sum'$, say, consists of more than one term. Then the homogeneous polynomial

$\sum' (L^{(i)*}/S^*)\,u^{p_{i,0}} \cdots u_k^{p_{i,k}}$

of the difference ring $\mathfrak{K}\langle\alpha\rangle\{u\}$ is not a product of irreducible factors of effective order 0, and hence has a solution $u = \beta$, $\beta \ne 0$. The polynomial $A^* - \beta$ has a solution $z = \gamma$. Then $\gamma$ is a solution of $F^*$ but not of $A^*$, contrary to condition (c) of the main theorem. Hence, $\sum'$ consists of just one term. Similarly, $\sum''$ consists of just one term. Because the left-hand sides of (7) and (8) are identical, and the coefficients on the right-hand sides are in $\mathfrak{K}\langle\alpha\rangle$, it is clear that the same power product of the $A_i^*$ occurs in each of these terms. We denote by $U$ the corresponding power product of the $A_i$.

Consider a value $\tau < 1$ of the weight parameter. Let $w_i$ be the weight of the term $T^{(i)} = L^{(i)}A^{p_{i,0}} \cdots A_k^{p_{i,k}}$ of (3). Upon substituting $y = z + \alpha$ the power product

$A^{p_{i,0}} \cdots A_k^{p_{i,k}}$

yields a term with power product

$z_n^{p_{i,0}} z_{n+1}^{p_{i,1}} \cdots z_{n+k}^{p_{i,k}}$

of weight $\tau^n w_i$ and other terms whose weights are greater, since their power products are formed by replacing transforms of $z$ in the indicated term by lower transforms.

If, in particular, $T^{(i)}$ is a term of (3) of least weight, $L^{(i)}$ is not in $\Sigma$, so that $T^{(i)}$ itself will yield a term with the above power product. Terms of greater weight than $T^{(i)}$ must yield only power products of $z$ and its transforms of weight greater than $\tau^n w_i$. If

$T^{(j)} = L^{(j)}A^{p_{j,0}} \cdots A_k^{p_{j,k}}$,  $j \ne i$,

is also of weight $w_i$ it yields a distinct power product of weight $\tau^n w_i$ and other power products of greater weight. Hence, one power product of weight $\tau^n w_i$ appears in the polynomial $F$ of the main theorem for each term of (3) of least weight, and these power products are of least weight in $F$. Since, by hypothesis, there is a unique term of $F$ of least weight, (3) contains a unique term of least weight for the value $\tau$ of the weight parameter. By continuity of the weight function this term is the same for all values of the weight parameter less than 1. It follows at once that it is the term of (3) of least degree.
Let the weight parameter be $\tau > 1$. Let the term

$N^{(i)}A^{q_{i,0}} \cdots A_k^{q_{i,k}}$

of (4) have weight $w_i$. Upon substituting $y = z + \alpha$ into the power product

$A^{q_{i,0}} \cdots A_k^{q_{i,k}}$

there results a term with power product

$z^{q_{i,0}} z_1^{q_{i,1}} \cdots z_k^{q_{i,k}}$

of weight $w_i$ and other terms whose weights are greater since their power products are formed by replacing transforms of $z$ in the indicated term by higher transforms. It follows, as above, that the term of (4) of least degree is the unique term of least weight for all values of the weight parameter exceeding 1.

Not every term on the right-hand side of (3) actually contains $A$, since $F$ has no factors of order $n$, and $A$ does not divide $S$. For sufficiently small values of the weight parameter a term free of $A$ is certainly of lower weight than a term which contains $A$. Hence, $U$ is free of $A$. Using (4) and considering large values of the weight parameter, we find that $U$ is free of $A_k$. This completes the proof of the statements made at the beginning of this section.

It is easy, but unnecessary for the proof of the main theorem, to show that ($\alpha$) and ($\beta$) are equivalent to (a), (b), and (c). Let ($\alpha$) and ($\beta$) hold. For positive values less than 1 of the weight parameter, $U$ (for explanation of the notation see the paragraph preceding the statement of the main theorem) contains a unique term of least weight. It follows from (3) and ($\alpha$) that this term furnishes the unique term of least weight in $F$. In a similar way it follows from (4) and ($\beta$) that $F$ contains a unique term of least weight for values of the weight parameter exceeding 1. Hence, (a) and (b) hold. From either (3) or (4) there results $F^* = \gamma U^*$, $\gamma \in \mathfrak{K}\langle\alpha\rangle$. Since, clearly, $U^*$ is a product of powers of transforms of $A^*$, this implies (c).


9. Completion of the proof. If $\mathfrak{M}$ is not an essential singular manifold of $F$ there exist, as we shall see, certain formal power series solutions of $F$ and its transforms. But we shall also see that such solutions cannot exist when $k = 2$ and the conditions ($\alpha$) and ($\beta$) hold. These facts establish the main theorem. Throughout this work we maintain the restrictions of § 8. These restrictions are removed in § 14.


10. Existence of series solutions. We suppose that $\mathfrak{M}$ is not an essential singular manifold of $F$. Then there exists a reflexive prime difference ideal $\Lambda$ containing $F$ and properly contained in $\Sigma$. According to Lemma IV of (3), $\Lambda$ is of effective order greater than $n$, so that $A \notin \Lambda$. For any integer $r > k$ we define $\Sigma_r$ and $\Lambda_r$ to be the intersections of $\Sigma$ and $\Lambda$ respectively with the ring $\mathfrak{K}_r = \mathfrak{K}[y, \dots, y_{n+r}]$. Let $\mathfrak{G} = \mathfrak{K}\langle\alpha\rangle$. $\Lambda_r$ generates an ideal in $\mathfrak{G}[y, \dots, y_{n+r}]$ whose radical is the intersection of prime components at least one of which admits the solution $y_i = \alpha_i$ $(i = 0, 1, \dots, n + r)$. Let $\Lambda_r'$ be such a component. Since $\Lambda_r'$ and $\Lambda_r$ have the same dimension (5, vol. 2, p. 69), $\Lambda_r' \cap \mathfrak{K}_r = \Lambda_r$. Hence, $A \cdots A_r \notin \Lambda_r'$. Then (4, p. 526) $\Lambda_r'$ admits a solution not annulling $A \cdots A_r$,

(9)  $y_i = \alpha_i + g_i(h)$,  $i = 0, 1, \dots, n + r$,

where $h$ is transcendental over $\mathfrak{G}$ and the $g_i(h)$ are formal series in positive integral powers of $h$ with coefficients algebraic over $\mathfrak{G}$.


11. Non-existence of series solutions. We now suppose that $k = 2$, and that the conditions ($\alpha$) and ($\beta$) hold. We assume the existence of the solutions (9) and obtain a contradiction if $r$ is sufficiently large.

If the series (9) is substituted into a polynomial $P$ of $\mathfrak{K}_r$, there results a series in non-negative integral powers of $h$. The term of zero degree in this series results from the substitution of the $\alpha_i$ into $P$ and, hence, is 0 if and only if $P \in \Sigma$. In particular, the series obtained from $A, A_1, \dots, A_r$ are not 0 but begin with terms of positive degree. We denote the series obtained from $A_i$ by

(10)  $A_i(h) = a_i h^{s_i} + \dots$,  $i = 0, \dots, r$,

where the $a_i$ are algebraic over $\mathfrak{G}$ and not 0, and the $s_i$ are positive. Substitution of (9) into $F, \dots, F_{r-k}$ $(= F_{r-2})$ gives 0, while substitution into $S$, $T$, or the coefficient of the term of least degree on the right-hand side of (3) or of (4) results in a series whose term of zero degree is not 0.


12. We consider the power product $U$ of the $A_i$ described in § 8. Since $k = 2$, $U = A_1^d$, $d > 0$. The numbering of the terms of the right-hand sides of (3) and (4) is to be chosen so that those terms whose power products are of degree less than $d$ in $A_1$ precede the remaining terms, and the term with power product $U$ is last. Let $s'$ and $t'$ denote the number of terms on the right-hand sides of (3) and (4) respectively whose power products are of degree less than $d$ in $A_1$. Since not every term on the right-hand side of (3) or of (4) has the factor $A_1$, $1 \le s' < s$, $1 \le t' < t$.
Let

(11)  $p_i(\theta) = p_{i,0} + (p_{i,1} - d)\theta + p_{i,2}\theta^2$,  $1 \le i \le s'$,
      $q_i(\theta) = q_{i,0} + (q_{i,1} - d)\theta + q_{i,2}\theta^2$,  $1 \le i \le t'$.

Since the power product $A_1^d$ is of lower weight than the other power products on the right-hand side of (3) for positive values of the weight parameter not exceeding 1, $p_i(\theta) > 0$, $0 < \theta \le 1$. Since $p_{i,1} - d < 0$, $1 \le i \le s'$, it follows that $p_{i,0} > 0$ for these $i$. Then each $p_i(\theta)$ is bounded away from 0 on the interval $0 < \theta \le 1$. Similarly, $q_i(\theta) > 0$, $\theta \ge 1$, whence it follows that the $q_i(\theta)$ are bounded away from 0 on this interval, and that the $q_{i,2}$, $1 \le i \le t'$, are all positive. Let $m > 0$ be such that

(12)  $p_i(\theta) > m$,  $0 < \theta \le 1$,  $1 \le i \le s'$,
      $q_i(\theta) > m$,  $\theta \ge 1$,  $1 \le i \le t'$.

We define $a$ to be the maximum of the quotients $(d - q_{i,1})/q_{i,2}$, $1 \le i \le t'$, and $b$ to be the maximum of the $p_{i,2}$, $1 \le i \le s'$, $q_{i,2}$, $1 \le i \le t'$. Then $a, b > 0$. Let $c = m/ab$, $d = a/c$. We shall obtain a contradiction from (9) with

$r > d + 2$.


13. Upon substituting the series (9) into the right-hand side of (3) the result must be zero. This can only be so if the power product of some term of (3) other than the last term produces an expansion beginning with a power of $h$ not higher than that with which the expansion of the last term begins. Since the last term is $YA_1^d$, $Y \notin \Sigma$, this means that there is an integer $j_0$ such that

$p_{j_0,0}s_0 + (p_{j_0,1} - d)s_1 + p_{j_0,2}s_2 \le 0$.

From the definition of $s'$ it is clear that $1 \le j_0 \le s'$. By applying similar reasoning to the right-hand side of (4) and to the transforms of orders not exceeding $r - 2$ of the right-hand sides of (3) and of (4) it follows that there exist integers $j_i$, $k_i$, $0 \le i \le r - 2$, such that

$1 \le j_i \le s'$,  $1 \le k_i \le t'$;

(13)  $p_{j_i,0}s_i + (p_{j_i,1} - d)s_{i+1} + p_{j_i,2}s_{i+2} \le 0$;
      $q_{k_i,0}s_i + (q_{k_i,1} - d)s_{i+1} + q_{k_i,2}s_{i+2} \le 0$.

Let $t_i = s_{i+1}/s_i > 0$ $(i = 0, \dots, r - 1)$. Then (13) yields, for $0 \le i \le r - 2$,

(14)  $p_{j_i,0} + (p_{j_i,1} - d)t_i + p_{j_i,2}t_it_{i+1} \le 0$;
      $q_{k_i,0} + (q_{k_i,1} - d)t_i + q_{k_i,2}t_it_{i+1} \le 0$.

It follows from (12) that for each $i$, $0 \le i \le r - 2$, either $0 < t_i \le 1$, and

$p_{j_i,0} + (p_{j_i,1} - d)t_i + p_{j_i,2}t_i^2 > m$;

or $t_i > 1$, and

$q_{k_i,0} + (q_{k_i,1} - d)t_i + q_{k_i,2}t_i^2 > m$.

From whichever of these inequalities is applicable it follows by subtraction of the corresponding inequality of (14) that either

$p_{j_i,2}t_i(t_i - t_{i+1}) > m$,

or

$q_{k_i,2}t_i(t_i - t_{i+1}) > m$.

In either case

(15)  $t_i(t_i - t_{i+1}) > m/b$,  $0 \le i \le r - 2$.














Now (14) yields

$(q_{k_i,1} - d) + q_{k_i,2}t_{i+1} \le 0$,  $0 \le i \le r - 2$,

since, for every $i$,

$q_{k_i,0} \ge 0$.

Thus,

(16)  $t_{i+1} \le (d - q_{k_i,1})/q_{k_i,2} \le a$,  $0 \le i \le r - 2$.

From (15) and (16) there results $t_i \le a$, $t_i - t_{i+1} > m/ba = c$, $1 \le i \le r - 2$. Hence, $t_{r-1} < a - (r - 2)c < a - dc = 0$. This is the desired contradiction.
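
The final counting step is elementary and can be checked numerically. The sketch below is an illustration only: the function names and the sample values of $a$ and $c$ are hypothetical, and it encodes nothing beyond the two facts used above, namely that the first ratio is at most $a$ and that each later ratio drops by more than $c$.

```python
from fractions import Fraction


def upper_bound_after(a, c, steps):
    """Upper bound for the ratio reached after `steps` drops, given that the
    starting ratio is at most a and every drop exceeds c."""
    return a - steps * c


def first_impossible_step(a, c):
    """Smallest number of drops after which the bound is no longer positive,
    so that a positive ratio can no longer exist."""
    steps = 0
    while upper_bound_after(a, c, steps) > 0:
        steps += 1
    return steps


# Hypothetical values: a = 3 and c = 1/2, so a/c = 6.
a, c = Fraction(3), Fraction(1, 2)
limit = first_impossible_step(a, c)
```

With these sample values the bound reaches 0 after $a/c = 6$ drops, so any chain with $r - 2 > a/c$ drops would force a negative ratio, which is the contradiction obtained in the text.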


14. Removal of the restrictions. It remains only to prove the main theorem without the restrictions of § 9. Let $\overline{\mathfrak{K}}$ be the inversive extension of $\mathfrak{K}$, $C$ the polynomial of order $n$ in $\overline{\mathfrak{K}}\{y\}$ and $G$ the polynomial of order $n + k$ in $\overline{\mathfrak{K}}\{y\}$ whose transforms of the appropriate orders are $A$ and $F$ respectively. Let $\mathfrak{M}'$ be the irreducible manifold over $\overline{\mathfrak{K}}\{y\}$ with generic zero $\alpha$, and $\Sigma'$ the reflexive prime difference ideal of $\overline{\mathfrak{K}}\{y\}$ with manifold $\mathfrak{M}'$. Then $\mathfrak{M}'$ is of order $n$, each irreducible factor of $G$ is of order and effective order $n + k$, and $G \in \Sigma'$.

Let $H$ be an irreducible factor of $G$ which is in $\Sigma'$. Then, in the notation of the main theorem, the polynomial consisting of the terms of $H$ of least weight for some value of the weight parameter is a factor of the polynomial consisting of the terms of $G$ of least weight for this value of the weight parameter. The latter polynomial is an inverse transform of the polynomial consisting of the terms of $F$ of least weight.

The first polynomial $D$ of a characteristic set of $\Sigma'$ divides $C$. Let $C = PD$. Since $\alpha$ is not a solution of $\partial C/\partial y_n$, $P \notin \Sigma'$. Then $D^* = \gamma C^*$, $\gamma \in \mathfrak{K}\langle\alpha\rangle$, $\gamma \ne 0$; and $C^*$ is an inverse transform of $A^*$.

The preceding statements show that $\mathfrak{M}'$ and $H$ satisfy the conditions (a), (b), and (c), so that $\mathfrak{M}'$ is an essential singular manifold of $H$ according to the restricted case of the main theorem. Hence, there is a polynomial $Q \in \overline{\mathfrak{K}}\{y\}$ such that $Q \notin \Sigma'$, and, if $E \in \Sigma'$, $QE$ vanishes on the manifold of $H$.

To each irreducible factor of $G$ there corresponds a polynomial with the properties of $Q$. For this has just been proved for factors which vanish on $\mathfrak{M}'$, and it is evident for other factors. Let $R$ be the product of these polynomials. Then $R \notin \Sigma'$, and, if $E \in \Sigma'$, $RE$ vanishes on the manifold of $G$. Some transform $S$ of $R$ is in $\mathfrak{K}\{y\}$. Since $\Sigma \subset \Sigma'$, $S \notin \Sigma$, and, if $E \in \Sigma$, $SE$ vanishes on the manifold of $F$. This proves that $\mathfrak{M}$ is a component of $\{F\}$, and, indeed, since its effective order is less than that of $F$, an essential singular manifold of $F$. The proof of the main theorem is complete.


15. Constructive methods. It is possible to determine by actual construction whether or not conditions (a), (b), and (c) are satisfied, provided one knows the first $k$ polynomials of a characteristic sequence of $\Sigma$. (For the meaning to be given to “characteristic sequence” if $\Sigma$ is not of equal order and effective order, see (2, footnote 7).) In fact, it was shown in (2, p. 447) that one can determine constructively whether or not (a) and (b) hold. But if (a) and (b) hold, (c) is true if and only if $F^*$ is a product of powers of transforms (including, possibly, inverse transforms) of $A^*$ and a factor in $\mathfrak{K}\langle\alpha\rangle$. For, on the one hand, this condition clearly implies (c). On the other hand, if (a), (b), and (c) hold it follows from ($\alpha$) and ($\beta$) under the conditions of the restricted form of the main theorem, and from this special case and the reasoning of § 14 in the general case, that $F^*$ is such a product. There is no difficulty in determining constructively whether or not $F^*$ is a product of this type. It follows, in particular, that, if $k = 2$, one can determine constructively, under the stated limitation, whether or not $\mathfrak{M}$ is an essential singular manifold of $F$.


REFERENCES


1. J. F. Ritt, Differential algebra, Coll. publ. Amer. Math. Soc., 33.
2. R. M. Cohn, Singular manifolds of difference polynomials, Ann. Math., 53 (1951), 445-63.
3. R. M. Cohn, Extensions of difference fields, Amer. J. Math., 74 (1952), 507-30.
4. R. M. Cohn, Essential singular manifolds of difference polynomials, Ann. Math., 57 (1953), 524-30.
5. W. V. D. Hodge and D. Pedoe, Methods of algebraic geometry (Cambridge University Press).


Rutgers University














SUBSPACES OF A GENERALIZED METRIC SPACE 
H. A. ELIOPOULOS 


Introduction. In a paper published in 1956, Rund (4) developed the differential geometry of a hypersurface of $n - 1$ dimensions imbedded in a Finsler space of $n$ dimensions, considered as locally Minkowskian.

The purpose of the present paper is to provide an extension of the results of (4) and thus develop a theory for the case of $m$-dimensional subspaces imbedded in a generalized (Finsler) metric space.

We consider an $n$-dimensional differentiable manifold $X_n$ and we restrict our attention to a suitably chosen co-ordinate neighbourhood of $X_n$ in which a co-ordinate system $x^i$ $(i = 1, 2, \dots, n)$ is defined. A system of equations of the type $x^i = x^i(t)$ defines a curve $C$ of $X_n$, the tangent vector $dx^i/dt$ of which is denoted by $\dot x^i$. We say that the manifold $X_n$ is endowed with a locally Minkowskian (Finsler) metric, if the length of an arc of the curve $C$ between two points $P_1$ and $P_2$ of $C$, corresponding to parameter values $t_1$ and $t_2$, is defined by an integral of the type

$\int_{t_1}^{t_2} F(x^i, \dot x^i)\,dt$,

where the function $F(x^i, \dot x^i)$ is continuous and continuously differentiable up to any required order in all its arguments, and also positively homogeneous of the first degree in the $\dot x^i$.
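Defining the metric tensor of $X_n$ by

$g_{ij}(x, \dot x) = \tfrac{1}{2}\,\dfrac{\partial^2 F^2(x, \dot x)}{\partial \dot x^i\,\partial \dot x^j}$,

we can put

$F^2(x, \dot x) = g_{ij}(x, \dot x)\,\dot x^i \dot x^j$;

$F$ must satisfy a third condition,

$g_{ij}(x, \dot x)\,\xi^i \xi^j > 0$,

for all $\dot x^i$ and all $\xi^i$, provided not all $\xi^i$ are equal to zero.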
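
As a check on these definitions, consider the Riemannian special case. This is a standard example and not part of the paper; the symmetric coefficients $a_{ij}(x)$ are an assumed notation, with $F^2(x, \dot x) = a_{ij}(x)\,\dot x^i \dot x^j$.

```latex
\begin{align*}
  g_{ij}(x,\dot x)
    &= \tfrac12\,\frac{\partial^2}{\partial \dot x^i\,\partial \dot x^j}
       \bigl(a_{hk}(x)\,\dot x^h \dot x^k\bigr)
     = \tfrac12\,\bigl(a_{ij}(x) + a_{ji}(x)\bigr) = a_{ij}(x), \\
  g_{ij}(x,\dot x)\,\dot x^i \dot x^j
    &= a_{ij}(x)\,\dot x^i \dot x^j = F^2(x,\dot x).
\end{align*}
```

In this case the metric tensor does not depend on the direction $\dot x^i$, and the third condition reduces to positive definiteness of $a_{ij}$.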
From Euler’s theorem on homogeneous functions we have 


8g.,(x, x) . 8°g.4(x, x) . 


Received April 9, 1958. The present paper is based on a thesis submitted at the University 
of Toronto for the degree of Doctor of Philosophy. The author wishes to express his sincere 
appreciation to Professor H. Rund for direction and advice in the course of this investigation, 
and to Professor G. F. D. Duff for valuable comments. The National Research Council of 
Canada supported the research by a fellowship. 













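We also define the generalized Christoffel symbols of the first and second kind by the relations

$\gamma^{j}_{hk}(x, \dot x) = g^{ji}(x, \dot x)\,[hk, i]_{(x, \dot x)}$,

$[hk, j]_{(x, \dot x)} = \tfrac{1}{2}\left(\dfrac{\partial g_{hj}(x, \dot x)}{\partial x^k} + \dfrac{\partial g_{kj}(x, \dot x)}{\partial x^h} - \dfrac{\partial g_{hk}(x, \dot x)}{\partial x^j}\right)$.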


Let $C$ be a continuous and continuously differentiable curve. At each point $P$ of $C$, with co-ordinates $x^k$, a Minkowskian tangent space $T_n(P)$ is defined by $F(x^k, \dot x^k)$. We consider an arbitrary vector field $X^i(x^k)$ along $C$ such that in each $T_n(P)$ a vector $X^i$ is defined. Let $Q$ be a neighbouring point with co-ordinates $x^k + dx^k$ on $C$, such that the arc length $PQ = ds$. The covariant differential $DX^i$ of $X^i$ at $P$ for the transition from $P$ to $Q$ is then defined by


(A.3) DX‘ = (<= + Piles nx" )aat 


where 





Ph(x, x’) = e. —_ 4 g(x, x ") Stanls {tf x’? 


(z,2") 
and $x'^i = dx^i/ds$.
We note that (A.3) depends only on the vector $X^i$ and the displacement $PQ$ for which it has been defined, and not on the curve $C$ passing through $P$ and $Q$. On the other hand, the covariant derivative of $X^i$ with respect to $x^k$ is given by


ax' ‘ , 
(A.4) Xi = SF + Pri(x, x’)x", 
where (5) 
- 0 rs) 0 , 
Pos = guP) = (ij, k] — 5 (2804 ps + 2 Ps aed 284 pi.) x 


Consider a continuous curve $C$ of $X_n$ which lies in some two-dimensional subspace $X_2$ of $X_n$, and let the parameters of $X_2$ be $u$ and $v$. The parametric curves $u = \text{const.}$ and $v = \text{const.}$ may cut $C$ in an arbitrary manner. Two directions are defined at each point of $C$, and they represent the directions of the tangents to the co-ordinate curves. Then, for a vector field $X^i(x^k)$, we have in the $X_2$,


Dx‘ Dx* ‘_k 
“Du = yt Do = ADs 
and thus, we obtain the commutation formula (6), 
= oe 


(A.6) 





_ t t n,m 1 nm on ym 
DoDu DuDv (X sam — X smn)&E + X in (Em 1m ). 
















If we use the relation 


e. 
a 


=|. 


we reduce (A.6) to 
D* ‘ D’*x' ‘ ‘ —_ 
DoDu — DuDo ~ “~a= — Xma)é a - 
Introducing the expression 
OPin _ OP im 
ax™ ax" 
(2 eS ax’! aPrs =) 
—_——-— -_ im —_——— 
+ ax’! ax™ ax!" Ox" } (e.2")' 
which we call the relative curvature tensor in view of the derivative $\partial x'^l/\partial x^m$ which appears in it, we may obtain the commutation relation
Xun = p = K SaeX”. 


We also define a covariant curvature tensor from the relation 





(A.7) K‘ymn(x, x’) = 








+ PuPhs — Pras 


K tamn(x, x”) = giy(x, x”) Kimn (x, x”); 


then, if $Y_i(x^k)$ are the covariant components of the vector field, we may obtain the relation


(A.8) Yime no Y to = Kn Yn. 

1. Generalities. Consider a differentiable subspace of $m$ dimensions $F_m$, imbedded in a locally Minkowskian (Finsler) space $F_n$, where $m < n$. Let

(1.1)  $x^i = x^i(u^\alpha)$,  $(i = 1 \dots n,\ \alpha = 1 \dots m)$,

be the equations defining $F_m$. We assume that the Jacobian matrix

$(X^i_\alpha) = \left(\dfrac{\partial x^i}{\partial u^\alpha}\right)$

is of rank $m$.

If the co-ordinate curves are regarded as curves of the $F_m$, then their tangents are given by

$X^i_\alpha = \dfrac{\partial x^i}{\partial u^\alpha}$

and at each point $P$ of $F_m$ we have $m$ independent vectors $\partial x^i/\partial u^\alpha$, which will span an $m$-dimensional plane $T_m(P) \subset T_n(P)$, where by $T_m(P)$ we mean the $m$-dimensional linear space tangent to $F_m$ at $P$.

A vector $X^i$ lies in $F_m$ if $X^i \in T_m(P)$, which implies that it is of the form

(1.2)  $X^i = U^\alpha \dfrac{\partial x^i}{\partial u^\alpha}$.










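$F_m$ will be endowed with an induced metric

$ds^2 = g_{\alpha\beta}(u, u')\,du^\alpha du^\beta$

with fundamental tensor given by

(1.3)  $g_{\alpha\beta}(u, u') = g_{ij}(x, x')\,\dfrac{\partial x^i}{\partial u^\alpha}\,\dfrac{\partial x^j}{\partial u^\beta}$,

where the tangent $u'^\alpha$ to $F_m$ satisfies the relation

(1.4)  $x'^i = X^i_\alpha u'^\alpha$.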


In general, we have to consider two sets of normals to $F_m$ at a given point $P$ of $F_m$. The first set is defined by the solutions $n^i$ of the equations

(1.5)  $n_i X^i_\alpha = g_{ij}(x, n)\,n^j X^i_\alpha = 0$.

These solutions are normalized by means of the relation

(1.6)  $F(x, n) = 1$  or  $g_{ij}(x, n)\,n^i n^j = 1$.

Since the matrix $(X^i_\alpha)$ is of rank $m$, we have $n - m$ independent solutions and, therefore, $n - m$ independent normal vectors. They span a vector space at $P$, and any vector of this space will be a linear combination of the independent vectors spanning the space.

We may define a different set of normals in the following way. Let $x'^i$ be an arbitrary but fixed direction tangential to $F_m$ at $P$. A second set of normals can be defined by the solutions $n^{*i}(x, x')$ of the equations

(1.7)  $g_{ij}(x, x')\,n^{*i}(x, x')\,X^j_\alpha = 0$.

The matrix $(X^i_\alpha)$ being of rank $m$, the system (1.7) admits $n - m$ independent solutions for the direction considered. We may write

$n^{*i}_{(\mu)} = n^{*i}_{(\mu)}(x, x')$,  $(\mu = 1 \dots n - m)$.

To each direction $x'$ tangent to $F_m$ at $P$ corresponds a set of vectors $n^{*i}_{(\mu)}(x, x')$, and the totality of these sets, for the different $x'$ at $P$, defines $n - m$ cones which are the normal cones of the subspace $F_m$ at a given point. We must emphasize that the generators of the normal cones do not necessarily lie in the space spanned by the normals $n^i$ at the same point. The concept of the normal cones for subspaces is an extension of the idea of a normal cone of a hypersurface $F_{n-1}$ (4).

We assume, as in the case of the $n^i(x)$, that the $n^{*i}(x, x')$ are normalized according to the relation

(1.8)  $F(x, n^*(x, x')) = g_{ij}(x, n^*(x, x'))\,n^{*i}(x, x')\,n^{*j}(x, x') = 1$.

We may also define $n - m$ tensors, independent of direction,

$\gamma_{(\mu)\alpha\beta}(u) = g_{ij}(x, n_{(\mu)})\,X^i_\alpha X^j_\beta$,

for the $n - m$ normals at $P$. Then we define the following sets of inverse projection parameters corresponding to $X^i_\alpha$:

(1.9)  $X^\alpha_i(x, x') = g_{ij}(x, x')\,g^{\alpha\beta}(u, u')\,X^j_\beta$,
       $Y^\alpha_{(\mu)i}(x) = g_{ij}(x, n_{(\mu)})\,\gamma^{\alpha\beta}_{(\mu)}(u)\,X^j_\beta$,

so that, in view of the equations (1.5), (1.7), we have

(1.10)  $n^{*i} X^\alpha_i = 0$,  $Y^\alpha_{(\mu)i}\,n^i_{(\mu)} = 0$,

and also

(1.10a)  $X^\alpha_i X^i_\beta = \delta^\alpha_\beta$,  $Y^\alpha_{(\mu)i} X^i_\beta = \delta^\alpha_\beta$.


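It is always possible to choose a set of $n - m$ orthogonal independent vectors $n^{*i}(x, x')$. Indeed, for any vector of the space spanned by the $n^{*i}_{(\mu)}$ we have

$$N^{*i}(x, x') = \sum_{(\mu)} \lambda_{(\mu)}\, n^{*i}_{(\mu)}(x, x'), \qquad (\mu = 1, \ldots, n - m).$$

Let us consider a set of $n - m$ such vectors; we can write down the $n - m$ relations

$$N^{*i}_{(\nu)}(x, x') = \sum_{(\mu)} \lambda_{(\nu)(\mu)}\, n^{*i}_{(\mu)}(x, x'), \qquad (\nu, \mu = 1, \ldots, n - m).$$

In order that the $N^{*i}_{(\nu)}$ should be orthogonal (with respect to $g_{ij}(x, x')$) the functions $\lambda_{(\nu)(\mu)}$ must satisfy the relations

$$g_{ij}(x, x')\, N^{*i}_{(\nu)} N^{*j}_{(\sigma)} = \sum_{(\mu)} \sum_{(\kappa)} g_{ij}(x, x')\, \lambda_{(\nu)(\mu)} \lambda_{(\sigma)(\kappa)}\, n^{*i}_{(\mu)}(x, x')\, n^{*j}_{(\kappa)}(x, x') = \delta_{(\nu)(\sigma)}. \tag{1.11}$$

If we put

$$T_{(\mu)(\kappa)}(x, x') = g_{ij}(x, x')\, n^{*i}_{(\mu)} n^{*j}_{(\kappa)}, \qquad (\mu, \kappa = 1, \ldots, n - m), \tag{1.12}$$

the equation (1.11) can be written

$$\sum_{(\mu)} \sum_{(\kappa)} T_{(\mu)(\kappa)}\, \lambda_{(\nu)(\mu)} \lambda_{(\sigma)(\kappa)} = 0, \qquad \text{for } \nu \neq \sigma. \tag{1.13}$$

Our problem reduces to finding $n - m$ sets of functions $\lambda_{(\nu)(\mu)}$ satisfying the equations (1.13).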


It is known that, if in a projective (m − 1)-dimensional space we introduce homogeneous co-ordinates, the equation of a hyperquadric has the form

$$a_{ik}\, x_i x_k = 0, \tag{1.13a}$$

and the co-ordinates $x_i$, $y_i$ of two points harmonically conjugate with respect to (1.13a) satisfy the relation

$$a_{ik}\, x_i y_k = 0$$

(see (1) for the 2-dimensional case). The problem of finding sets of functions $\lambda_{(\nu)(\mu)}$ satisfying (1.13) is equivalent to the problem of finding the vertices of polyhedra self-polar with respect to

$$\sum_{(\mu)} \sum_{(\kappa)} T_{(\mu)(\kappa)}\, \lambda_{(\mu)} \lambda_{(\kappa)} = 0.$$


One vertex $P_1$ of such a polyhedron can be chosen arbitrarily in the space, but not on the quadric; a second vertex $P_2$, arbitrarily in the polar hyperplane of $P_1$, but not on the quadric; a third vertex $P_3$, arbitrarily on the intersection of the polar planes of $P_1$, $P_2$, but not on the quadric; $P_4$ on the intersection of the polar planes of $P_1$, $P_2$, $P_3$, and so on. The last one will be on the intersection of the polar hyperplanes of all the previous points. Since $P_1, P_2, \ldots, P_{n-m-1}$ can be chosen with $n - m - 1,\ n - m - 2,\ \ldots,\ 1$ degrees of freedom respectively, there are

$$(n - m - 1) + (n - m - 2) + \ldots + 1 = \tfrac{1}{2}(n - m)(n - m - 1)$$

degrees of freedom in choosing the $n - m$ sets of functions $\lambda$.
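The successive choice of vertices described above can be sketched numerically as a conjugate Gram-Schmidt process. The following is only an illustration under stated assumptions: the matrix `T` is a hypothetical symmetric, positive definite stand-in for $T_{(\mu)(\kappa)}$ with $n - m = 3$, the starting vertices are taken to be the coordinate vertices (one admissible choice among the free ones), and the function names are not from the paper.

```python
from fractions import Fraction

def bilinear(T, a, b):
    """Return the sum over mu, kappa of T[mu][kappa] * a[mu] * b[kappa]."""
    return sum(T[i][j] * a[i] * b[j] for i in range(len(a)) for j in range(len(b)))

def self_polar_coefficients(T):
    """Conjugate Gram-Schmidt: rows lam[v] with bilinear(T, lam[v], lam[s]) == 0 for v != s."""
    k = len(T)
    lam = []
    for v in range(k):
        # Start from the v-th coordinate vertex (an assumed, admissible choice).
        row = [Fraction(int(i == v)) for i in range(k)]
        for prev in lam:
            # Move the new vertex into the polar hyperplane of each earlier vertex.
            coeff = bilinear(T, row, prev) / bilinear(T, prev, prev)
            row = [r - coeff * p for r, p in zip(row, prev)]
        lam.append(row)
    return lam

# Hypothetical Gram matrix T_{(mu)(kappa)} for n - m = 3.
T = [[Fraction(2), Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(2), Fraction(1)],
     [Fraction(0), Fraction(1), Fraction(2)]]
lam = self_polar_coefficients(T)
k = len(T)
degrees_of_freedom = k * (k - 1) // 2  # (n - m)(n - m - 1) / 2
```

Exact rational arithmetic is used so that the conjugacy relations (1.13) hold exactly rather than up to rounding; for a positive definite `T` no vertex can lie on the quadric, so the divisions are safe.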

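The induced covariant derivative of the vector $X^i$ can be defined just as for a hypersurface (4). Let $x^i = x^i(s)$ be a curve $C$ of $F_m$, so that $x'^i$ is tangent to $F_m$. We consider a continuous and continuously differentiable vector field tangent to $F_m$:

$$X^i(x^k) = X^i_\alpha\, U^\alpha(u^\beta). \tag{1.16}$$

The induced covariant derivative of the vector field along $C$ in the space $F_m$, that is, the tensor defined by

$$U^\alpha_{;\beta}(u, u') = \frac{\partial U^\alpha}{\partial u^\beta} + P^{*\alpha}_{\gamma\beta}(u, u')\, U^\gamma, \tag{1.17}$$

is given by projection onto $F_m$ of the covariant derivative $X^i_{;k}$ of $X^i$ with respect to $F_n$,

$$g_{ij}(x, x')\, X^i_\alpha X^k_\beta X^j_{;k} = g_{\alpha\gamma}(u, u')\, U^\gamma_{;\beta}, \tag{1.18}$$

where

$$X^i_{;k} = \frac{\partial X^i}{\partial x^k} + P^{*i}_{hk}(x, x')\, X^h. \tag{1.19}$$

One can prove easily that

$$g_{ij}(x, x')\, X^i_\alpha \left( \frac{\partial^2 x^j}{\partial u^\beta \partial u^\gamma} + P^{*j}_{hk} X^h_\beta X^k_\gamma \right) = P^*_{\beta\alpha\gamma}(u, u'),$$

with

$$P^*_{\beta\alpha\gamma}(u, u') = g_{\alpha\epsilon}\, P^{*\epsilon}_{\beta\gamma}.$$

It is obvious that the $P^{*\alpha}_{\beta\gamma}$ are symmetric in the lower indices, because the $P^{*i}_{hk}$ are symmetric.

It is very easy to show that the quantities (1.17) form the components of a tensor, in the sense indicated by their indices, under a transformation of the co-ordinates $u^\alpha$ of $F_m$ (see Eliopoulos (3)).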


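Since the subspace $F_m$ is endowed with a metric tensor $g_{\alpha\beta}(u, u')$, we can write immediately the Euler-Lagrange equations for the geodesics of that space,

$$\frac{du'^\alpha}{ds} + \left\{ {\alpha \atop \beta\gamma} \right\}(u, u')\, \frac{du^\beta}{ds} \frac{du^\gamma}{ds} = 0,$$

where

$$\left\{ {\alpha \atop \beta\gamma} \right\}(u, u')$$

are the intrinsic Christoffel symbols. We may also write

$$g_{\alpha\delta}\, \frac{du'^\alpha}{ds} + [\beta\gamma, \delta](u, u')\, \frac{du^\beta}{ds} \frac{du^\gamma}{ds} = 0,$$

or

$$\frac{du'^\alpha}{ds} + P^{*\alpha}_{\beta\gamma}\, u'^\beta u'^\gamma = 0.$$

We immediately see that

$$\frac{\delta u'^\alpha}{\delta s} = 0$$

along a geodesic, that is, the geodesics are autoparallel curves.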


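2. Normal curvatures of $F_m$. We consider a curve $C$ of $F_m$, $x^i = x^i(s)$, passing through a given point $P$. We take the parameter $s$ to be the arc-length, and the unit tangent vector to $C$ at $P$ will be denoted by $x'^i$. Let us assume, for the moment, that the vector field $U^\alpha$ of equations (1.16) coincides with the tangent vectors $u'^\alpha$ of $C$. If we denote covariant differentiation in $F_m$ by $\delta$, we obtain

$$\frac{\delta u'^\alpha}{\delta s} = \frac{du'^\alpha}{ds} + P^{*\alpha}_{\beta\gamma}(u, u')\, u'^\beta u'^\gamma = \frac{du'^\alpha}{ds} + \left\{ {\alpha \atop \beta\gamma} \right\} u'^\beta u'^\gamma. \tag{2.1}$$

By using the definition of $Dx'^i/Ds$ and differentiating $x'^i = X^i_\alpha u'^\alpha$ we find

$$\frac{Dx'^i}{Ds} = \frac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta}\, u'^\alpha u'^\beta + X^i_\alpha \frac{\delta u'^\alpha}{\delta s} - X^i_\alpha P^{*\alpha}_{\beta\gamma}\, u'^\beta u'^\gamma + P^{*i}_{hk}\, x'^h x'^k.$$

If we put

$$X^i_{\alpha\beta} = \frac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta} - X^i_\gamma P^{*\gamma}_{\alpha\beta} + P^{*i}_{hk} X^h_\alpha X^k_\beta, \tag{2.3}$$

we may write

$$\frac{Dx'^i}{Ds} = X^i_{\alpha\beta}\, u'^\alpha u'^\beta + X^i_\alpha \frac{\delta u'^\alpha}{\delta s}. \tag{2.4}$$

The expressions $X^i_{\alpha\beta}$, which are the components of a tensor, may be considered as the generalized covariant derivatives of the $X^i_\alpha$ with respect to $u^\beta$, in the sense used in (4, 7). We note that the $X^i_{\alpha\beta}$ are symmetric with respect to the lower indices.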

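The $X^i_{\alpha\beta}$ can be given the following geometric interpretation. We consider the geodesic $\bar{C}$ of the space $F_n$ through the point $P$, tangent to the given direction $x'^i$. Let $\bar{z}^i = \bar{z}^i(s)$ be the equations of $\bar{C}$. We also consider a geodesic $C$ of the space $F_m$ through the same point $P$, and tangent to $x'^i$. Let $z^i = z^i(s)$ be its equations. We choose two points, one on $\bar{C}$ and the other on $C$, corresponding to the same value of $s$, and in the neighbourhood of $P$. The co-ordinates of these points can be expanded in Taylor series, for small values of $s$, so that

$$\bar{z}^i(s) = \bar{z}^i_P + \bar{z}'^i_P\, s + \tfrac{1}{2} \bar{z}''^i_P\, s^2 + \ldots,$$
$$z^i(s) = z^i_P + z'^i_P\, s + \tfrac{1}{2} z''^i_P\, s^2 + \ldots,$$

where by $\bar{z}_P$, $z_P$, etc., we mean the values of these functions at the point $P$. Then

$$\zeta^i = z^i - \bar{z}^i = \tfrac{1}{2}\left( z''^i - \bar{z}''^i \right) s^2 + O(s^3),$$

because $z^i_P = \bar{z}^i_P$ and $z'^i_P = \bar{z}'^i_P$. From the equations of geodesics we have for $\bar{C}$

$$\frac{d\bar{z}'^i}{ds} + P^{*i}_{hk}\, \bar{z}'^h \bar{z}'^k = 0.$$

Also for $C$, we have

$$\frac{Dz'^i}{Ds} = \frac{dz'^i}{ds} + P^{*i}_{hk}\, z'^h z'^k,$$

and therefore, since the two curves have the same tangent at $P$,

$$z''^i - \bar{z}''^i = \frac{Dz'^i}{Ds} \quad \text{at } P.$$

In view of (2.4) applied to a geodesic of $F_m$ we obtain

$$X^i_{\alpha\beta}(u, u')\, u'^\alpha u'^\beta = \lim_{s \to 0} \frac{2\zeta^i}{s^2}.$$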
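As a rough numerical illustration of this limit, consider the Euclidean special case (an assumption made only for the example, not the general Finsler setting): the unit sphere in Euclidean 3-space, where the geodesic of the ambient space is the tangent line and the geodesic of the subspace is a great circle. The function name below is hypothetical.

```python
import math

def second_order_deviation(s):
    """Return 2 * (z(s) - zbar(s)) / s**2 for a great circle z and its tangent line zbar."""
    # Geodesic of the unit sphere through P = (1, 0, 0) with unit tangent (0, 1, 0).
    z = (math.cos(s), math.sin(s), 0.0)
    # Geodesic (straight line) of Euclidean 3-space with the same point and tangent.
    zbar = (1.0, s, 0.0)
    return tuple(2.0 * (a - b) / s**2 for a, b in zip(z, zbar))

approx = second_order_deviation(1e-3)
```

For small `s` the result approaches the vector (−1, 0, 0), which is normal to the sphere at P, in line with the statement that the limit yields the normal quantity $X^i_{\alpha\beta} u'^\alpha u'^\beta$.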


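We consider the formulae (1.5) and (1.7). Since $n_{(\nu)i}$ and $n^*_{(\mu)i}$ are solutions of the same linear equations, we may write

$$n_{(\nu)i} = \sum_{(\mu)} p_{\nu\mu}\, n^*_{(\mu)i}; \tag{2.8}$$

multiplying the above equations by $n^i_{(\nu)}$, and since the $n^i_{(\nu)}$ are unit vectors, we obtain

$$\sum_{(\mu)} p_{\nu\mu}\, n^*_{(\mu)i}\, n^i_{(\nu)} = 1.$$

The equations (2.8) can also be written as

$$g_{ij}(x, n_{(\nu)})\, n^j_{(\nu)} = \sum_{(\mu)} p_{\nu\mu}\, g_{ij}(x, x')\, n^{*j}_{(\mu)},$$

and if we multiply by $n^{*i}_{(\lambda)}$ we find

$$g_{ij}(x, n_{(\nu)})\, n^j_{(\nu)} n^{*i}_{(\lambda)} = \sum_{(\mu)} p_{\nu\mu}\, g_{ij}(x, x')\, n^{*i}_{(\lambda)} n^{*j}_{(\mu)} = p_{\nu\lambda}\, \psi_{(\lambda)},$$

since

$$g_{ij}(x, x')\, n^{*i}_{(\lambda)} n^{*j}_{(\mu)} = \psi_{(\lambda)}\, \delta_{\lambda\mu}$$

(no summation over $\lambda$ involved). The above relation may be written

$$\cos(n_{(\nu)}, n^*_{(\lambda)}) = p_{\nu\lambda}\, \psi_{(\lambda)},$$

or

$$p_{\nu\lambda} = \frac{\cos(n_{(\nu)}, n^*_{(\lambda)})}{\psi_{(\lambda)}}, \tag{2.9}$$

since the cosine of the angle of two vectors $n_{(\nu)}$, $n^*_{(\lambda)}$ is defined by

$$\cos(n_{(\nu)}, n^*_{(\lambda)}) = \frac{g_{ij}(x, n_{(\nu)})\, n^i_{(\nu)} n^{*j}_{(\lambda)}}{\left[ g_{ij}(x, n_{(\nu)})\, n^i_{(\nu)} n^j_{(\nu)} \right]^{1/2} \left[ g_{hk}(x, n^*_{(\lambda)})\, n^{*h}_{(\lambda)} n^{*k}_{(\lambda)} \right]^{1/2}}$$

and $n_{(\nu)}$, $n^*_{(\lambda)}$ are unit vectors.

We now prove the following theorems.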

THEOREM I. The principal normal of a geodesic $G$ of $F_m$ lies in the space spanned by the secondary normals $n^*$.

Proof. We multiply the relation (1.18) by $u'^\beta$, obtaining

$$g_{ij}(x, x')\, X^i_\alpha \frac{Dx'^j}{Ds} = g_{\alpha\beta} \frac{\delta u'^\beta}{\delta s},$$

which is satisfied by the tangent vector $u'^\alpha$ to any curve $C$ in $F_m$. For a geodesic $G$ we have $\delta u'^\alpha / \delta s = 0$, hence

$$g_{ij}(x, x')\, X^i_\alpha \frac{Dx'^j}{Ds} = 0.$$

Since the vector $Dx'^i/Ds$, which defines the principal normal to the geodesic $G$, satisfies the equation (1.7), it belongs to the space spanned by the $n^*_{(\mu)}$; therefore

$$\frac{Dx'^i}{Ds} = \sum_{(\mu)} A_{(\mu)}\, n^{*i}_{(\mu)}, \tag{2.10}$$

where $n^{*i}_{(\mu)}$ is a set of $n - m$ orthogonal independent vectors of that space.


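THEOREM II. The tensor $X^i_{\alpha\beta}$, considered as a function of a given line element $(x^i, x'^i)$, lies in the space spanned by the secondary normals $n^*$.

Proof. We consider the equations

$$X^i = X^i_\alpha U^\alpha, \qquad g_{ij} X^i_\alpha X^k_\beta X^j_{;k} = g_{\alpha\gamma} U^\gamma_{;\beta};$$

then we can write

$$g_{ij} X^i_\gamma X^j_\delta P^{*\delta}_{\alpha\beta} = g_{ij} X^i_\gamma \left( \frac{\partial^2 x^j}{\partial u^\alpha \partial u^\beta} + P^{*j}_{hk} X^h_\alpha X^k_\beta \right),$$

and because of (2.3) we obtain

$$g_{ij}(x, x')\, X^i_\gamma X^j_{\alpha\beta}(u, u') = 0, \tag{2.11}$$

which proves the theorem.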


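The vector $X^i_{\alpha\beta}$ (in $i$) will be a linear combination of the $n^*$ and therefore

$$X^i_{\alpha\beta} = \sum_{(\mu)} \Omega^*_{(\mu)\alpha\beta}(u, u')\, n^{*i}_{(\mu)}; \tag{2.12}$$

multiplying the relation (2.12) by $n_{(\nu)i}$ and putting

$$\sum_{(\mu)} \Omega^*_{(\mu)\alpha\beta} \cos(n_{(\nu)}, n^*_{(\mu)}) = \Omega_{(\nu)\alpha\beta}, \tag{2.12a}$$

we find

$$n_{(\nu)i}\, X^i_{\alpha\beta} = \Omega_{(\nu)\alpha\beta}. \tag{2.13}$$

It is obvious that the $\Omega_{(\nu)\alpha\beta}$ are tensors symmetric in $\alpha$, $\beta$.

The relations (2.12) and (2.13) are fundamental for the whole theory of subspaces of a Finsler space.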

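We consider the relation (2.13) and we multiply by $u'^\alpha u'^\beta$; then

$$\Omega_{(\nu)\alpha\beta}\, u'^\alpha u'^\beta = n_{(\nu)i} \left[ \frac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta}\, u'^\alpha u'^\beta + P^{*i}_{hk}\, x'^h x'^k \right], \tag{2.14}$$

but

$$n_{(\nu)i} \frac{dx'^i}{ds} = n_{(\nu)i} \frac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta}\, u'^\alpha u'^\beta. \tag{2.15}$$

Therefore, combining the above equation with (2.14), we obtain

$$\Omega_{(\nu)\alpha\beta}\, u'^\alpha u'^\beta = n_{(\nu)i} \left[ \frac{dx'^i}{ds} + P^{*i}_{hk}\, x'^h x'^k \right] = n_{(\nu)i} \frac{Dx'^i}{Ds}. \tag{2.16}$$

We can easily see that this is the same for all curves of $F_m$ with tangent vector $x'^i$, but depends on the choice of $(x, x')$, as in classical differential geometry. Indeed, differentiating the relation

$$n_{(\nu)i}\, x'^i = 0$$

we find

$$n_{(\nu)i} \frac{Dx'^i}{Ds} = - \frac{Dn_{(\nu)i}}{Ds}\, x'^i,$$

and since $(Dn_{(\nu)i}/Ds)\, x'^i$ depends on $x$, $x'$ only, so does the right-hand side. From the identity

$$n_{(\nu)i} \frac{Dx'^i}{Ds} = \left| \frac{Dx'}{Ds} \right| \cos\!\left( n_{(\nu)}, \frac{Dx'}{Ds} \right)$$

we obtain

$$\left| \frac{Dx'}{Ds} \right| = \frac{1}{\rho_c} = \frac{\Omega_{(\nu)\alpha\beta}\, u'^\alpha u'^\beta}{\cos(n_{(\nu)}, Dx'/Ds)}, \tag{2.17}$$

where $\rho_c$ is the radius of curvature of the curve regarded as a curve of $F_n$. The relation (2.17) may also be written

$$\frac{\cos(n_{(\nu)}, Dx'/Ds)}{\rho_c} = \Omega_{(\nu)\alpha\beta}\, u'^\alpha u'^\beta, \tag{2.18}$$

and since $\Omega_{(\nu)\alpha\beta}\, u'^\alpha u'^\beta$ is the same for all curves of $F_m$ tangent to $x'^i$, we obtain Meusnier's theorem of classical differential geometry. We may therefore regard

$$\Omega_{(\nu)\alpha\beta}\, u'^\alpha u'^\beta = \frac{1}{R_{(\nu)}}$$

as the normal curvature corresponding to the normal $n^i_{(\nu)}$. It is obvious from (2.17) that the ratio

$$\frac{\Omega_{(\nu)\alpha\beta}\, u'^\alpha u'^\beta}{\cos(n_{(\nu)}, Dx'/Ds)}$$

is independent of the choice of $n^i_{(\nu)}$.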
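A small numerical check of the classical statement behind (2.18) can be made on the unit sphere in Euclidean 3-space. This is only the Euclidean special case, assumed here for illustration; the function name and the chosen colatitudes are hypothetical. For each latitude circle, the curvature $1/\rho_c$ multiplied by the cosine of the angle between the sphere normal and the principal normal of the circle should give the same normal curvature.

```python
import math

def meusnier_check(theta, t=0.3):
    """Return (1/rho_c, cos(n, principal normal), normal curvature) for a latitude circle."""
    rho_c = math.sin(theta)  # Euclidean radius of the circle at colatitude theta
    # Principal normal of the circle: unit vector pointing at the polar axis.
    principal_normal = (-math.cos(t), -math.sin(t), 0.0)
    point = (math.sin(theta) * math.cos(t),
             math.sin(theta) * math.sin(t),
             math.cos(theta))
    n = tuple(-c for c in point)  # inward unit normal of the unit sphere
    cos_angle = sum(a * b for a, b in zip(n, principal_normal))
    return 1.0 / rho_c, cos_angle, cos_angle / rho_c

results = [meusnier_check(theta) for theta in (0.4, 0.9, math.pi / 2)]
```

The third entry of each result is the same for all three circles, although their curvatures $1/\rho_c$ differ; this is the content of Meusnier's theorem in this simple setting.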
The concept of the principal direction of a hypersurface $F_{n-1}$ can be extended to any subspace $F_m$. Indeed, we have shown that to each direction at a point $P$ of $F_m$ correspond $n - m$ normal curvatures

$$\left( R_{(\nu)}(u, u') \right)^{-1} = \frac{\Omega_{(\nu)\alpha\beta}(u, u')\, du^\alpha du^\beta}{g_{\alpha\beta}(u, u')\, du^\alpha du^\beta}$$

associated with the given direction $u'$. If we put

$$\Omega_{(\nu)\alpha\beta}(u, u')\, du^\alpha du^\beta = 1, \tag{2.19}$$

we obtain a number of $n - m$ loci, of $m - 1$ dimensions each, on the hyperplane spanned by $X^i_\alpha$, in the Minkowskian tangent space to $F_n$, at the given point. The principal directions will be given by the extreme values of $g_{\alpha\beta}(u, u')\, u'^\alpha u'^\beta$ subject to the conditions (2.19), where $u^\alpha$ is kept fixed. In other words, principal directions are directions for which the normal curvatures assume extreme values. According to the multiplier rule we must seek solutions of the equations

$$\frac{\partial}{\partial u'^\gamma} \left[ g_{\alpha\beta}(u, u')\, u'^\alpha u'^\beta + \lambda \left( \Omega_{(\nu)\alpha\beta}(u, u')\, u'^\alpha u'^\beta - 1 \right) \right] = 0,$$

which, after performing the differentiations and using Euler's theorem for homogeneous functions, may be written

$$2 g_{\alpha\gamma}(u, u')\, u'^\alpha + 2\lambda\, \Omega_{(\nu)\alpha\gamma}(u, u')\, u'^\alpha + \lambda \frac{\partial \Omega_{(\nu)\alpha\beta}}{\partial u'^\gamma}\, u'^\alpha u'^\beta = 0. \tag{2.20}$$

The equations (2.20) are of the same type as the corresponding equations for the principal directions of a hypersurface $F_{n-1}$ (4). Applying the same algebraic algorithm, we obtain the following eigenvalue equations:

$$g_{\alpha\gamma}(u, u')\, u'^\alpha = R_{(\nu)}(u, u')\, \Omega_{(\nu)\alpha\gamma}(u, u')\, u'^\alpha, \tag{2.21}$$

where $(R_{(\nu)}(u, u'))^{-1}$ is the normal curvature corresponding to a solution of (2.21). This is a non-linear eigenvalue problem with eigenvalue $R_{(\nu)}^{-1}$, and little can be said about the number of possible solutions.













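Let us assume that at least two independent solutions $u'^\alpha_{(\nu)1}$, $u'^\alpha_{(\nu)2}$ corresponding to two distinct normal curvatures $1/R_{(\nu)1}$, $1/R_{(\nu)2}$ exist. Then, from (2.21) we obtain

$$g_{\alpha\gamma}(u, u'_{(\nu)1})\, u'^\alpha_{(\nu)1} u'^\gamma_{(\nu)2} = R_{(\nu)1}\, \Omega_{(\nu)\alpha\gamma}(u, u'_{(\nu)1})\, u'^\alpha_{(\nu)1} u'^\gamma_{(\nu)2},$$

$$g_{\alpha\gamma}(u, u'_{(\nu)2})\, u'^\alpha_{(\nu)2} u'^\gamma_{(\nu)1} = R_{(\nu)2}\, \Omega_{(\nu)\alpha\gamma}(u, u'_{(\nu)2})\, u'^\alpha_{(\nu)2} u'^\gamma_{(\nu)1};$$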


subtracting, we find 





(2 22) cos (ui, tu »)2) — COs (14,2, té(»)1) 








Rw R2 
7 7 
at | Sere ens) _ | Senet Hos ena | 
Rw)2 Rw 


When we refer to the same normal $n^i_{(\nu)}$, the above formula becomes





2.23 cos (uty, u(2)) — cos (u‘2), u‘1)) 
(2.23) Rauuk 
(vy) 14\(»)2 


, , 
Qi. ray (U4, ui‘) Qe ay(%, uu) ) pa ry 
= a om Uy, Ue. 
Rw Rwy2 


The equation (2.23) is a generalization of the orthogonality relation between 
principal directions of surfaces in classical differential geometry. Indeed, in 
a locally Euclidean space, the cosine of the angle of two directions is a sym- 
metric function of them. Therefore the left-hand side of (2.23) vanishes and 
we obtain 








(2 24) Qa, (U4, Uy )uyus” = Qs) ay(, U2) ui us" =0 
, Rw Rwy ; 
But (2.21) provides 
ary 
ay - = Qi» ay(U, u’ )uyus’, 


and thus equation (2.24) becomes 


1 1 ) 
A , a on 0. 
a (ui a) (ad Rw2 


Since $R_{(\nu)1} \neq R_{(\nu)2}$ we obtain $\cos(u'_1, u'_2) = 0$, which demonstrates the orthogonality of $u'_1$, $u'_2$.

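We can also define a secondary normal curvature associated to a line element $x$, $x'$ and depending on $n^*$. For that purpose we consider the relation

$$\frac{Dx'^i}{Ds} = \sum_{(\mu)} A_{(\mu)}\, n^{*i}_{(\mu)} + X^i_\alpha \frac{\delta u'^\alpha}{\delta s} \tag{2.25}$$

for an arbitrary curve of $F_m$ and we multiply it by $n_{(\nu)i}$; then

$$n_{(\nu)i} \frac{Dx'^i}{Ds} = \sum_{(\mu)} A_{(\mu)} \cos(n_{(\nu)}, n^*_{(\mu)})$$

and

$$\cos\!\left( n_{(\nu)}, \frac{Dx'}{Ds} \right) = \frac{n_{(\nu)i}\, Dx'^i/Ds}{|Dx'/Ds|} = \rho_c \sum_{(\mu)} A_{(\mu)} \cos(n_{(\nu)}, n^*_{(\mu)}).$$

Or, because of (2.17),

$$\Omega_{(\nu)\alpha\beta}\, u'^\alpha u'^\beta = \sum_{(\mu)} A_{(\mu)} \cos(n_{(\nu)}, n^*_{(\mu)}),$$

and hence, in view of (2.12a),

$$A_{(\mu)} = \Omega^*_{(\mu)\alpha\beta}\, u'^\alpha u'^\beta.$$

From (2.25) we obtain

$$\frac{Dx'^i}{Ds} = \sum_{(\mu)} \left( \Omega^*_{(\mu)\alpha\beta}\, u'^\alpha u'^\beta\, n^{*i}_{(\mu)} \right) + X^i_\alpha \frac{\delta u'^\alpha}{\delta s},$$

and for a geodesic

$$\frac{Dx'^i}{Ds} = \sum_{(\mu)} \left( \Omega^*_{(\mu)\alpha\beta}\, u'^\alpha u'^\beta\, n^{*i}_{(\mu)} \right).$$

We define the secondary normal curvature to be

$$\frac{1}{R^{*2}} = g_{ij}(x, x') \frac{Dx'^i}{Ds} \frac{Dx'^j}{Ds} = \sum_{(\mu)} \psi_{(\mu)}(x, x')\, \Omega^*_{(\mu)\alpha\beta}\, \Omega^*_{(\mu)\gamma\delta}\, u'^\alpha u'^\beta u'^\gamma u'^\delta.$$

From the way $1/R^*$ is defined we see that it is independent of the particular set of normals $n^*_{(\mu)}$.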


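Let us consider the biquadratic form in the differentials,

$$\sum_{(\mu)} \psi_{(\mu)}(x, x')\, \Omega^*_{(\mu)\alpha\beta}\, \Omega^*_{(\mu)\gamma\delta}\, du^\alpha du^\beta du^\gamma du^\delta;$$

we may call it the secondary second fundamental form of $F_m$. Generalizing the concepts of conjugate and asymptotic directions of a surface in classical differential geometry, we may say that two directions at a point defined by $du^\alpha$ and $\delta u^\alpha$ are conjugate when

$$\sum_{(\mu)} \psi_{(\mu)}\, \Omega^*_{(\mu)\alpha\beta}\, \Omega^*_{(\mu)\gamma\delta}\, du^\alpha\, \delta u^\beta\, du^\gamma\, \delta u^\delta = 0,$$

and asymptotic or self-conjugate when

$$\sum_{(\mu)} \psi_{(\mu)}\, \Omega^*_{(\mu)\alpha\beta}\, \Omega^*_{(\mu)\gamma\delta}\, du^\alpha du^\beta du^\gamma du^\delta = 0.$$

From the above relation and the one defining the secondary normal curvature we conclude that the secondary normal curvature in an asymptotic direction is always equal to zero, as in Riemannian geometry (2).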


3. Covariant derivatives of the normal vectors $n^*$, $n$. We define the tensor $n^{*i}_{(\mu);\beta}$, the covariant derivative of the vector $n^{*i}_{(\mu)}$, by projecting $n^{*i}_{(\mu);k}$ onto $F_m$:

$$n^{*i}_{(\mu);\beta} = n^{*i}_{(\mu);k}\, X^k_\beta. \tag{3.1}$$

Obviously

$$n^{*i}_{(\mu);\beta} = \frac{\partial n^{*i}_{(\mu)}}{\partial u^\beta} + P^{*i}_{hk}(x, x')\, n^{*h}_{(\mu)} X^k_\beta. \tag{3.2}$$

The $n^{*i}_{(\mu);\beta}$ are not tangential to $F_m$, in contrast to Riemannian geometry, and this is the source of much of the difficulty in the derivation of the Gauss-Codazzi equations.


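By differentiating the relation (1.7) with respect to $u^\beta$ and combining the result with (3.2), we obtain

$$\frac{\partial g_{ij}}{\partial u^\beta}\, n^{*j}_{(\mu)} X^i_\alpha + g_{ij}(x, x')\, X^i_\alpha \left( n^{*j}_{(\mu);\beta} - P^{*j}_{hk}(x, x')\, n^{*h}_{(\mu)} X^k_\beta \right) + g_{ij}\, n^{*j}_{(\mu)} \frac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta} = 0,$$

or, after rearranging the terms,

$$g_{ij} X^i_\alpha n^{*j}_{(\mu);\beta} + g_{ij}\, n^{*j}_{(\mu)} \frac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta} + n^{*j}_{(\mu)} X^i_\alpha \left( \frac{\partial g_{ij}}{\partial u^\beta} - g_{ih} P^{*h}_{jk} X^k_\beta \right) = 0.$$

We add and subtract $g_{ij} P^{*i}_{hk} X^h_\alpha X^k_\beta\, n^{*j}_{(\mu)}$ in the left-hand side of the above relation, thus obtaining

$$g_{ij}(x, x')\, X^i_\alpha n^{*j}_{(\mu);\beta} + g_{ij}\, n^{*j}_{(\mu)} \left( \frac{\partial^2 x^i}{\partial u^\alpha \partial u^\beta} + P^{*i}_{hk} X^h_\alpha X^k_\beta \right) + n^{*j}_{(\mu)} X^i_\alpha \left( \frac{\partial g_{ij}}{\partial u^\beta} - g_{ih} P^{*h}_{jk} X^k_\beta - g_{hj} P^{*h}_{ik} X^k_\beta \right) = 0. \tag{3.3}$$

The term in the last bracket of (3.3) represents the covariant derivative of $g_{ij}(x, x')$ with respect to $x^k$, multiplied by $X^k_\beta$; if we put

$$C_{ijk}(x, x') = g_{ij;k}(x, x'),$$

we may write for (3.3),

$$g_{ij}(x, x')\, X^i_\alpha n^{*j}_{(\mu);\beta} + \psi_{(\mu)}\, \Omega^*_{(\mu)\alpha\beta} + C_{ijk}\, X^k_\beta X^i_\alpha\, n^{*j}_{(\mu)} = 0. \tag{3.4}$$

We decompose $n^{*i}_{(\mu);\beta}$, which is not tangential to $F_m$, as follows:

$$n^{*i}_{(\mu);\beta} = B^\alpha_{(\mu)\beta}\, X^i_\alpha + \sum_{(\kappa)} N^{(\kappa)}_{(\mu)\beta}\, n^{*i}_{(\kappa)}. \tag{3.5}$$

In order to find $B^\alpha_{(\mu)\beta}$, we multiply (3.5) by $g_{ij} X^j_\gamma$; in view of (1.7) and (3.4), we have

$$B^\alpha_{(\mu)\beta} = - \psi_{(\mu)}\, \Omega^*_{(\mu)\gamma\beta}\, g^{\alpha\gamma} - C_{ijk}(x, x')\, g^{\alpha\gamma} X^k_\beta X^i_\gamma\, n^{*j}_{(\mu)}. \tag{3.6}$$

To obtain the $N$'s we multiply (3.5) by $n^*_{(\lambda)i}$; then

$$n^{*i}_{(\mu);\beta}\, n^*_{(\lambda)i} = N^{(\lambda)}_{(\mu)\beta}\, \psi_{(\lambda)}. \tag{3.7}$$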

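The $N$'s are not independent since they satisfy some symmetry conditions which we obtain in the following way. We consider the relations

$$g_{ij}(x, x')\, n^{*i}_{(\mu)} n^{*j}_{(\nu)} = \psi_{(\mu)}\, \delta_{\mu\nu} \qquad \text{(no summation is involved in } \mu\text{)},$$

and we differentiate them with respect to $u^\beta$; between the relation which we find and the equation (3.2) we eliminate $\partial n^{*i}_{(\mu)} / \partial u^\beta$, thus

$$\frac{\partial g_{ij}(x, x')}{\partial u^\beta}\, n^{*i}_{(\mu)} n^{*j}_{(\nu)} + g_{ij}(x, x') \left[ n^{*i}_{(\mu);\beta}\, n^{*j}_{(\nu)} + n^{*i}_{(\mu)}\, n^{*j}_{(\nu);\beta} - P^{*i}_{hk}(x, x')\, n^{*h}_{(\mu)} n^{*j}_{(\nu)} X^k_\beta - P^{*j}_{hk}(x, x')\, n^{*h}_{(\nu)} n^{*i}_{(\mu)} X^k_\beta \right] = \frac{\partial \psi_{(\mu)}}{\partial u^\beta}\, \delta_{\mu\nu}. \tag{3.8}$$

If we use the relation (3.7), we find

$$\psi_{(\nu)} N^{(\nu)}_{(\mu)\beta} + \psi_{(\mu)} N^{(\mu)}_{(\nu)\beta} = \frac{\partial \psi_{(\mu)}}{\partial u^\beta}\, \delta_{\mu\nu} - C_{ijk}\, n^{*i}_{(\mu)} n^{*j}_{(\nu)} X^k_\beta. \tag{3.9}$$

In conclusion, we have the covariant derivative of $n^*_{(\mu)}$ given by

$$n^{*i}_{(\mu);\beta} = - \psi_{(\mu)}\, X^i_\alpha\, g^{\alpha\gamma}\, \Omega^*_{(\mu)\gamma\beta} - C_{hjk}\, g^{\alpha\gamma} X^k_\beta X^h_\gamma\, n^{*j}_{(\mu)}\, X^i_\alpha + \sum_{(\lambda)} N^{(\lambda)}_{(\mu)\beta}\, n^{*i}_{(\lambda)}, \tag{3.10}$$

where the quantities $N^{(\lambda)}_{(\mu)\beta}$ (vectors in $\beta$) satisfy the symmetry conditions (3.9). In the case of a hypersurface $F_{n-1}$, the equation (3.10) becomes identical with (4.9) of (4), the relation (3.9) giving

$$N_\beta = \frac{1}{2\psi} \left( \frac{\partial \psi}{\partial u^\beta} - C_{ijk}\, n^{*i} n^{*j} X^k_\beta \right).$$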

The equations (3.10) suffer from the disadvantage that the terms ¥y)» in 
the right-hand side involve the derivatives of the tangent x’‘ to the curve 
along which we are differentiating, so that (3.10) depends on the curve under 
consideration. 

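As in the case of the $n^{*i}$, we define the covariant derivative $n^i_{(\nu);\beta}$ by projecting $n^i_{(\nu);k}$ onto $F_m$,

$$n^i_{(\nu);\beta} = n^i_{(\nu);k}\, X^k_\beta = \frac{\partial n^i_{(\nu)}}{\partial u^\beta} + P^{*i}_{hk}(x, x')\, n^h_{(\nu)} X^k_\beta, \tag{3.11}$$

where $x'^i$ is some direction tangential to $F_m$ at the given point. Here we obtain

$$\Omega_{(\nu)\alpha\beta} = - C_{(\nu)ijk}\, X^k_\beta X^i_\alpha\, n^j_{(\nu)} - g_{ij}(x, n_{(\nu)})\, X^i_\alpha\, n^j_{(\nu);\beta}, \tag{3.12}$$

where

$$C_{(\nu)ijk} = g_{ij;k}(x, n_{(\nu)}). \tag{3.13}$$

We decompose the tensor $n^i_{(\nu);\beta}$ as follows:

$$n^i_{(\nu);\beta} = A^\alpha_{(\nu)\beta}\, X^i_\alpha + \sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, n^i_{(\kappa)}; \tag{3.14}$$

multiplying (3.14) by $g_{ij}(x, n_{(\nu)})\, X^j_\gamma$ we find that

$$A^\alpha_{(\nu)\beta} = - \gamma^{\alpha\gamma}_{(\nu)}\, \Omega_{(\nu)\gamma\beta} - \gamma^{\alpha\gamma}_{(\nu)}\, C_{(\nu)ijk}\, X^k_\beta X^i_\gamma\, n^j_{(\nu)}, \tag{3.15}$$

and therefore

$$n^i_{(\nu);\beta} = - \gamma^{\alpha\gamma}_{(\nu)}\, \Omega_{(\nu)\gamma\beta}\, X^i_\alpha - \gamma^{\alpha\gamma}_{(\nu)}\, C_{(\nu)hjk}\, X^h_\gamma X^k_\beta X^i_\alpha\, n^j_{(\nu)} + \sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, n^i_{(\kappa)}. \tag{3.16}$$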













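In order to obtain more information on the $\nu$'s we multiply (3.14) by $n_{(\lambda)i}$; then

$$n^i_{(\nu);\beta}\, n_{(\lambda)i} = \sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, n^i_{(\kappa)} n_{(\lambda)i} = \sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, a_{(\kappa)(\lambda)}, \tag{3.17}$$

where

$$a_{(\kappa)(\lambda)} = n^i_{(\kappa)}\, n_{(\lambda)i} = \cos(n_{(\lambda)}, n_{(\kappa)}).$$

We note that in general $a_{(\kappa)(\lambda)} \neq a_{(\lambda)(\kappa)}$. Assuming that the determinant $|a_{(\kappa)(\lambda)}|$ is different from zero, we can solve the system (3.17) with respect to the values of $\nu^{(\kappa)}_{(\nu)\beta}$ and we obtain the $\nu$'s as linear combinations of the expressions $n^i_{(\nu);\beta}\, n_{(\lambda)i}$, that is,

$$\nu^{(\kappa)}_{(\nu)\beta} = \sum_{(\lambda)} \frac{A^{(\kappa)(\lambda)}}{A} \left( n^i_{(\nu);\beta}\, n_{(\lambda)i} \right), \tag{3.18}$$

where $A^{(\kappa)(\lambda)}$ is the cofactor of the determinant $|a_{(\kappa)(\lambda)}|$ corresponding to the element $a_{(\kappa)(\lambda)}$ and $A$ is the value of that determinant.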

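As an application of the above theory we may obtain Rodrigues' formula of classical differential geometry.

If we consider the relation (3.16) we may write

$$\frac{Dn^i_{(\nu)}}{Ds} = - \gamma^{\alpha\gamma}_{(\nu)}\, \Omega_{(\nu)\gamma\beta}\, u'^\beta X^i_\alpha - \gamma^{\alpha\gamma}_{(\nu)}\, C_{(\nu)hjk}\, X^h_\gamma X^k_\beta X^i_\alpha\, n^j_{(\nu)}\, u'^\beta + \sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, n^i_{(\kappa)}\, u'^\beta. \tag{3.19}$$

Since $n_{(\nu)i} = g_{ij}(x, n_{(\nu)})\, n^j_{(\nu)}$, we obtain by differentiation

$$\frac{Dn_{(\nu)i}}{Ds} = C_{(\nu)ijk}\, x'^k n^j_{(\nu)} + g_{ij}(x, n_{(\nu)}) \frac{Dn^j_{(\nu)}}{Ds}, \tag{3.20}$$

and substituting (3.19) in (3.20), we find

$$\frac{Dn_{(\nu)i}}{Ds} = - g_{ij}(x, n_{(\nu)})\, \gamma^{\alpha\gamma}_{(\nu)}\, \Omega_{(\nu)\gamma\beta}\, u'^\beta X^j_\alpha + \sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, u'^\beta\, g_{ij}(x, n_{(\nu)})\, n^j_{(\kappa)}.$$

Multiplying the above equation by $X^i_\alpha$, we obtain

$$X^i_\alpha \frac{Dn_{(\nu)i}}{Ds} = - \Omega_{(\nu)\alpha\beta}\, u'^\beta + \sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, u'^\beta\, g_{ij}(x, n_{(\nu)})\, n^j_{(\kappa)} X^i_\alpha. \tag{3.21}$$

If $R^{-1}_{(\nu)}(x, x')$ is the normal curvature corresponding to a principal direction $x'^i$ of $F_m$ and to a normal $n_{(\nu)}$, we have from equation (2.21)

$$g_{\alpha\beta}(u, u')\, u'^\beta = R_{(\nu)}(x, x')\, \Omega_{(\nu)\alpha\beta}\, u'^\beta.$$

Combining the above relation with (3.21), we write

$$X^i_\alpha \frac{Dn_{(\nu)i}}{Ds} = - \left[ R_{(\nu)}(x, x') \right]^{-1} g_{\alpha\beta}(u, u')\, u'^\beta + \sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, u'^\beta\, g_{ij}(x, n_{(\nu)})\, n^j_{(\kappa)} X^i_\alpha. \tag{3.22}$$

For $m = n - 1$ (hypersurface $F_{n-1}$), the second term in the right-hand side becomes zero. Indeed in that case the hypersurface has a unique normal $n$, and therefore the sum

$$g_{ij}(x, n_{(\nu)})\, n^j_{(\kappa)} X^i_\alpha$$

is reduced to $g_{ij}(x, n)\, n^j X^i_\alpha$, which is identically equal to zero. But in the case of any subspace $F_m$, the second term does not vanish, unless we choose a particular set of normals $n_{(\nu)}$ such that

$$\sum_{(\kappa)} \nu^{(\kappa)}_{(\nu)\beta}\, u'^\beta\, g_{ij}(x, n_{(\nu)})\, n^j_{(\kappa)} X^i_\alpha = 0;$$

then

$$X^i_\alpha \frac{Dn_{(\nu)i}}{Ds} = - \left[ R_{(\nu)}(x, x') \right]^{-1} g_{\alpha\beta}(u, u')\, u'^\beta.$$

Putting $g_{\alpha\beta}(u, u')\, u'^\beta = y_\alpha$, that is, introducing the covariant component in $F_m$ of $u'^\beta$, we find

$$X^i_\alpha \frac{Dn_{(\nu)i}}{Ds} = - \left( R_{(\nu)}(x, x') \right)^{-1} y_\alpha. \tag{3.22}$$

The above formula is analogous to Rodrigues' formula and it is similar to the one for a hypersurface (4).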


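4. The Gauss and Codazzi equations for an $F_m$. We may obtain relations connecting the curvature tensor of the space $F_n$ with the curvature tensor of $F_m$ and the coefficients $\Omega^*$. To do so, we first consider the covariant derivative of $X^i_\alpha$ with respect to $u^\beta$ (metric in $F_m$),

$$X^i_{\alpha;\beta} = \frac{\partial X^i_\alpha}{\partial u^\beta} - P^{*\gamma}_{\alpha\beta}\, X^i_\gamma.$$

Combining the above relation with relation (2.3), we obtain

$$X^i_{\alpha;\beta} = X^i_{\alpha\beta} - P^{*i}_{hk} X^h_\alpha X^k_\beta \tag{4.1}$$

or, because of (2.12),

$$X^i_{\alpha;\beta} = \sum_{(\mu)} \Omega^*_{(\mu)\alpha\beta}\, n^{*i}_{(\mu)} - P^{*i}_{hk} X^h_\alpha X^k_\beta. \tag{4.2}$$

We know that

$$X^i_{\alpha;\beta;\gamma} - X^i_{\alpha;\gamma;\beta} = R^\delta_{\alpha\beta\gamma}\, X^i_\delta,$$

where $R^\delta_{\alpha\beta\gamma}$ is the curvature tensor of $F_m$ corresponding to the induced connection coefficients $P^{*\alpha}_{\beta\gamma}$. By using the expression (4.2), we can write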


OP , OPhe ) “a (2 a oie) | 
(4.3) ReasrX' = xg (22 ax”? dau’ Xs — + 4 au x 


= Ph (X0.4X6 — Xe pX7) )+ > (Qe ras in).7 are Qnrey M8) 
+ } Mis) (Qiprab.7 a nrer.6) 


(#) 
+ > » Pring) (QirarXs —_ QrapX3) . 


(m) 













With the help of (4.2) the second term of (4.3) can be written 
X2XbXy(PesPai — PisPrd) — Do Pani} QirerX — MyapX7), 
therefore, (4.3) becomes 
apr! 4 Phe ) : (285 apr, sa) | 
(4A) Regt = x9 (SER 4 2285 28. ay — (SEB 4 OF a 
+ X2X5Xi (Pr Pri — — PijPu ) + } (QirasBo.7 — rer .6) 


(s) 
+ > nu (Qyras.1 one Vurer.s): 


() 








the first and the second term in the above equation may be substituted by

$$R^i_{hjk}(x, x')\, X^h_\alpha X^j_\beta X^k_\gamma$$

according to (A.7), where $R^i_{hjk}$ is the curvature tensor of the space $F_n$. The equation (4.4) then becomes


(4.5) XiRiapy(u, u") = Raerlx, x’)X2XbXy + Do (QMrastid.r — Urey.) 
+ p> Nip) (Qiarad.7 — Qrer.6)- 
_ 


If we use (3.12), we may eliminate the derivatives of m*,,) from (4.5) and 
thus we obtain 


(4.5a) XjR s,(u, u’) = Royilx, x’)X2X5X!} 
+ > Yin (Qiras yey — Qa Qiu ey )X ag - 


— Chg? do (Qo 0aXF — WyarX5)nw + Do DO (NG Mras 


(a) (s) ™ 


(ws) * *i * * *i 
— NX Qoar)no5 + DO (Qiras.y — Unrar.s) Me 


(m) 
Multiplying the above equation by g,,; (x, x) X,’, we find 
(4.6)  gaRiapy(u, 0’) — Dy Yur (QinrasMony — Ura Mors) 


(s) 
= gis(x, x’ )Rnes(x, x’ )X2X5XIXY — XFCS (MGrasX* — MirayX5) MD 


and multiplying the same equation by g;, (x, x’) n*’,,), we get 
(4.7) £15(x, x’) mp Rnei(x, x" )X2X6X} = Con p> (QiurasXs = MurayX5) Myon 


+ ba Yin (NQiras — Ny Qinar) + vate ea Qrar.6) = 0. 


(#) 


The equations (4.6) and (4.7) represent a generalization of the Gauss-Codazzi 
equations of Riemannian geometry. 

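It is obvious that different forms of Gauss-Codazzi equations are obtained when one considers the fundamental forms $\Omega_{(\nu)\alpha\beta}\, du^\alpha du^\beta$ together with the normals $n(x)$. For that purpose, we decompose the vector $X^i_{\alpha\beta}$ (considered as a vector with respect to the upper index $i$) into components along the normals $n$ and the tangent plane at the considered point. We put

$$X^i_{\alpha\beta} = \sum_{(\mu)} A_{(\mu)\alpha\beta}\, n^i_{(\mu)} + W^i_{\alpha\beta}, \tag{4.8}$$

where $W^i_{\alpha\beta}$ satisfies the condition

$$W^i_{\alpha\beta}\, n_{(\nu)i} = 0, \tag{4.9}$$

and by multiplying (4.8) by $n_{(\nu)i}$, we obtain

$$\Omega_{(\nu)\alpha\beta} = n_{(\nu)i}\, X^i_{\alpha\beta} = \sum_{(\mu)} A_{(\mu)\alpha\beta} \cos(n_{(\nu)}, n_{(\mu)}); \tag{4.10}$$

hence $W^i_{\alpha\beta}$ is given by the relation

$$W^i_{\alpha\beta} = X^i_{\alpha\beta} - \sum_{(\mu)} A_{(\mu)\alpha\beta}\, n^i_{(\mu)} = \sum_{(\mu)} \Omega^*_{(\mu)\alpha\beta}\, n^{*i}_{(\mu)} - \sum_{(\nu)} A_{(\nu)\alpha\beta}\, n^i_{(\nu)}. \tag{4.10a}$$

Since the vectors $n^i$ are in general different from the vectors $n^{*i}$ and they do not belong to the space spanned by the $n^{*i}$, we look for a decomposition of the $n^i$ along the $n^{*i}$ and the vectors defining the tangent space to $F_m$. We decompose the vector $n^i$ in the form

$$n^i_{(\nu)} = \sum_{(\lambda)} T^{(\lambda)}_{(\nu)}\, n^{*i}_{(\lambda)} - M^\alpha_{(\nu)}\, X^i_\alpha; \tag{4.11}$$

multiplication by $n_{(\mu)i}$ provides

$$n^i_{(\nu)}\, n_{(\mu)i} = \sum_{(\lambda)} T^{(\lambda)}_{(\nu)} \cos(n_{(\mu)}, n^*_{(\lambda)}); \tag{4.11a}$$

from (4.11) we also obtain

$$M^\alpha_{(\nu)} = - n^i_{(\nu)}\, X^\alpha_i. \tag{4.12}$$

Combining (4.11) with (4.10a) and also (4.11a), (4.10), we may write

$$W^i_{\alpha\beta} = \sum_{(\mu)} \Omega^*_{(\mu)\alpha\beta}\, n^{*i}_{(\mu)} - \sum_{(\lambda)} \sum_{(\mu)} \left( A_{(\mu)\alpha\beta}\, T^{(\lambda)}_{(\mu)} \right) n^{*i}_{(\lambda)} + X^i_\gamma \sum_{(\mu)} A_{(\mu)\alpha\beta}\, M^\gamma_{(\mu)}, \tag{4.12a}$$

$$\Omega_{(\nu)\alpha\beta} = \sum_{(\lambda)} \left( \sum_{(\mu)} A_{(\mu)\alpha\beta}\, T^{(\lambda)}_{(\mu)} \right) \cos(n_{(\nu)}, n^*_{(\lambda)}). \tag{4.13}$$

If we compare the equations (4.13) with (2.12a), we see that

$$\sum_{(\mu)} A_{(\mu)\alpha\beta}\, T^{(\lambda)}_{(\mu)} = \Omega^*_{(\lambda)\alpha\beta}. \tag{4.14}$$

In view of the equation (4.14), the relation (4.12a) becomes

$$W^i_{\alpha\beta} = X^i_\gamma \sum_{(\mu)} A_{(\mu)\alpha\beta}\, M^\gamma_{(\mu)}; \tag{4.15}$$

thus, $M^\alpha_{(\nu)}$ is given by (4.12), $A_{(\mu)\alpha\beta}$ by (4.10) and $W^i_{\alpha\beta}$ by (4.15).

Using the equations (4.1) and (4.8) again, we write

$$X^i_{\alpha;\beta} = \sum_{(\mu)} A_{(\mu)\alpha\beta}\, n^i_{(\mu)} + W^i_{\alpha\beta} - P^{*i}_{hk} X^h_\alpha X^k_\beta; \tag{4.1a}$$

differentiating with respect to the metric of $F_m$ and because of (3.11) we obtain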


(4.16) Xie, = p> Awastisy., + Wea — D> Pr'Awrasnt.X? 


(a) 
OPm , OP mi ax"! 
(2% + ax! =F X2Xs — Phe. 2X - me XeXhy + > A wat.r%, 
(#) 





or 
(4.17) So (Awes.x — Awer.s)®i) + p> (A wragtts).. — A waytty,6) 
* 
+ X2X$XIR ai (x, x’) + Wesy — Wop — Pr'(W2,X5 — WigX3) = R%yXi. 


Using the expression for the generalized covariant derivative of W.s' with 
respect to u’ we find 


(4.18) XiR%, = > (A wa8.y — Awar.s) Min) + > (A wastn).y — Awrar®y.0) 
+ XeXbXzRinei(x, x’) + Weoy — Wer, 

which, with the help of (3.16), can be written 

(4.19) Xi[Rey, — >> Vis) (Quy Gay — MyyerA wras)] = Rivi(x, x’)X2X$X} 
- > Bis) C eu ana) (A wrapXy — A wrarXs) + p> (A wres.y — Awa.) Bi) 
+ DOD 2A rash, — Awarins) + Wer, — Wang. 


@® 


The relation (4.19) is important because it provides the Gauss and Codazzi 
formulae. Indeed, multiplying (4.19) by g,; (x, m,»))m,)’? and putting 


Dw) m = g*'(x, My) )B13(X, cry) Cow) anes 


_ (#) 
Miy)(»)y = p> PA) COS (i>), May), 
(A) 


we obtain the final equation 
(4.20) > O15») (A wab.y — Awar.s) = > Dw ernest (A wrasXy 
— AwerX$) 
= > (100) wr7A (was — Mey)(r8A (rer) — RavrXaXX pmo 1 
— (Wes, — Ware)ess(x, ™»)nl,). 


It is possible to remove the terms involving W,.,s,‘ and replace them by ex- 
pressions depending on Aagg or Qag. 


Indeed, 


(4.21) Wis, = xi(X AwaryMin + L AwasMiny1) 
+ (x A waeMin )X bn 


and since X4,‘ = 2,)Aunw' + Ws,' and m,),Ws,' = 0, we obtain instead 
of (4.20), 













> Qy)(») (A wrat.y — Awer.s) = > Dou») aby (A GrapXo — A wrarX5) 
« * 


(4.22) - > (91¢») (yA (as — (rp Wey) 
s 
ae p> > M0.) 0)(A wate ary — AwayA ays) 
s 
= RieX2X5Xint,. 


We consider again the equation (4.19). Multiplying by g,,(x, m))X;’ we 
obtain 
(4.23) -yense{ Rosy — 2 Vin (Qed wer — MererA woes) | 


(A) 


Rini(x, x’ XEX$X LX tg (x, N»)) 
= > g* (x, Mp) )£15(*, 14») Cow) ane (A uae Xy _ A warX$)X4 


(se) 


+ ys (A (s)eB.y ~~ A way.8 8 15(; Mi») )X tnt») 
(pm) 
+ , a i £13(x, nm») )X fndy(A (@ab¥O) ey ~~ A wey? we) 


(Be) a 


+ (Wasy _ Wes) 8 13(*, mo» )XE; 


by eliminating the derivatives W.s,‘ we find a relation 
(4.24) gu,(x, Mm») )X t(Wasy _ Wes) = ror] (A wet.y — Aay.s)Mis) 
oe 
+ Zo AwasM'y — AvrerM's| + ty, mo) X] LL Awadorn 


(s) (s) O 


— AwerAow) - Minny + XID DS (AweaAon — Award ove) Min Min | . 


(j@* O® 


for the last term of (4.23). 


The relations (4.22) and (4.23) thus represent alternative forms of the generalized 
Gauss and Codazzi equations. 


REFERENCES 


1. H. S. M. Coxeter, The real projective plane (Cambridge, 1955).

2. L. P. Eisenhart, Riemannian Geometry (Princeton, 1949).

3. H. A. Eliopoulos, Methods of generalized metric geometry with applications to mathematical physics, Ph.D. thesis, University of Toronto, August, 1956.

4. H. Rund, Hypersurfaces of a Finsler space, Can. J. Math., 8 (1956), 487-503.

5. H. Rund, Ueber die Parallelverschiebung in Finslerschen Räumen, Math. Z., 54 (1951), 115-128.

6. H. Rund, On the analytical properties of curvature tensors in Finsler spaces, Math. Ann., 127 (1954), 82-104.

7. A. W. Tucker, On generalized covariant differentiation, Ann. Math., 32 (1931), 451-60.





Assumption University of Windsor 











ON THE IRREDUCIBILITY OF CONVEX BODIES 
A. C. WOODS 


1. Introduction. We select a Cartesian co-ordinate system in n-dimensional Euclidean space $R_n$ with origin $O$ and employ the usual point-vector notation.

By a lattice $\Lambda$ in $R_n$ we mean the set of all rational integral combinations of $n$ linearly independent points $X_1, X_2, \ldots, X_n$ of $R_n$. The points $X_1, X_2, \ldots, X_n$ are said to form a basis of $\Lambda$. Let $\{X_1, X_2, \ldots, X_n\}$ denote the determinant formed when the co-ordinates of $X_i$ are taken in order as the $i$th row of the determinant for $i = 1, 2, \ldots, n$. The absolute value of this determinant is called the determinant $d(\Lambda)$ of $\Lambda$. It is well known that $d(\Lambda)$ is independent of the particular basis one takes for $\Lambda$.
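The independence of $d(\Lambda)$ from the basis can be checked on a small example. The sketch below is illustrative only: the basis and the integer matrix `U` of determinant 1 are arbitrary choices, and the function names are not from the paper. Passing from one basis to another amounts to multiplying by such an integer matrix with determinant ±1, so the absolute value of the determinant is unchanged.

```python
def det(rows):
    """Determinant by cofactor expansion along the first row (adequate for small n)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [r[:j] + r[j + 1:] for r in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def change_basis(U, basis):
    """Return the rows of U * basis, i.e. integral combinations of the basis points."""
    n = len(basis)
    return [[sum(U[i][k] * basis[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

basis = [[2, 1, 0], [0, 3, 1], [1, 0, 4]]  # rows X_1, X_2, X_3 (arbitrary example)
U = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]      # integer matrix with determinant 1
new_basis = change_basis(U, basis)
d_old = abs(det(basis))
d_new = abs(det(new_basis))
```

Both `d_old` and `d_new` equal 25 here, as expected for two bases of the same lattice.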

A star body in $R_n$ is a closed set of points $K$ such that if $X \in K$ then every point of the form $tX$ where $-1 < t < 1$ is an inner point of $K$. A star body $K$ is called a convex body if it is bounded and satisfies the convex property: if $X \in K$, $Y \in K$ then $tX + (1 - t)Y \in K$ provided $0 \le t \le 1$. It is further called strictly convex if $X \in K$, $Y \in K$ implies that $tX + (1 - t)Y$ is an inner point of $K$ when $0 < t < 1$ and $X \neq Y$.

Let $\Lambda$ be a lattice and $K$ a star body in $R_n$. We say that $\Lambda$ is K-admissible if no point of $\Lambda$ other than $O$ is an inner point of $K$. If $K$ is such that no K-admissible lattice exists then $K$ is said to be of the infinite type, otherwise $K$ is said to be of the finite type. If $K$ is of the finite type the number $\inf d(\Lambda)$ extended over all K-admissible lattices $\Lambda$ is called the critical determinant $\Delta(K)$ of $K$, and any K-admissible lattice $\Lambda$ of determinant $d(\Lambda) = \Delta(K)$ is called a critical lattice of $K$. It is well known that if $K$ is of the finite type then at least one critical lattice of $K$ exists.
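The notion of admissibility can be tried out on a familiar planar example. The following sketch assumes $K$ is the closed unit disc in $R_2$ and tests the hexagonal lattice, which has determinant $\sqrt{3}/2$; the coefficient bound, the tolerance and the function names are assumptions of the illustration, and the finite search is only adequate because $K$ is bounded and the basis vectors are short.

```python
import math
from itertools import product

def is_admissible(basis, is_inner, bound=5):
    """Check that no non-zero lattice point with coefficients in [-bound, bound] is inner to K."""
    n = len(basis)
    for coeffs in product(range(-bound, bound + 1), repeat=n):
        if not any(coeffs):
            continue  # the origin O is always allowed
        point = [sum(c * basis[k][j] for k, c in enumerate(coeffs)) for j in range(n)]
        if is_inner(point):
            return False
    return True

def inner_unit_disc(p, tol=1e-9):
    """Inner points of the closed unit disc: Euclidean norm strictly below 1 (up to tol)."""
    return math.hypot(p[0], p[1]) < 1 - tol

hexagonal = [[1.0, 0.0], [0.5, math.sqrt(3) / 2]]
shrunk = [[0.9, 0.0], [0.45, 0.9 * math.sqrt(3) / 2]]  # same lattice scaled by 0.9
d_hex = abs(hexagonal[0][0] * hexagonal[1][1] - hexagonal[0][1] * hexagonal[1][0])
```

The hexagonal lattice passes the test because its shortest non-zero points lie exactly on the boundary of the disc, while the scaled copy fails since those points move into the interior.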

Let $K$ be a star body of the finite type in $R_n$. If $K$ is such that any star body properly contained in $K$ has a smaller critical determinant than $K$ has, we say that $K$ is S-irreducible; otherwise $K$ is said to be S-reducible.

Let $K$ be a convex body in $R_n$. If $K$ is such that any convex body properly contained in $K$ has a smaller critical determinant than $K$ has, then we say that $K$ is C-irreducible; otherwise we say that $K$ is C-reducible.

The property of S-irreducibility was first studied by Mahler (1), who gave necessary but insufficient conditions for a star body to be S-irreducible. Later (2) he considered the property of C-irreducibility and showed that if $n = 2$ then any C-irreducible convex body is also S-irreducible. Rogers (5) then gave a set of necessary and sufficient conditions for S-irreducibility which will be stated later.


Received July 3, 1958. 
The purpose here is to give an example of a convex body in $R_3$ that is C-irreducible but not S-irreducible. The proof that the example has these properties relies to a large extent on the work of Whitworth (6). To clarify the picture regarding C-irreducibility we formulate a set of necessary and sufficient conditions for C-irreducibility analogous to the set given by Rogers for S-irreducibility, the proof following similar lines.


2. The set L(K). The results stated in this section are classical.

Let $K$ be a convex body in $R_n$. We define $L(K)$ to be the set of all points $X$ of the boundary of $K$ such that if $X$ is contained in any line segment of the boundary of $K$ then $X$ is an endpoint of the line segment. Such points are sometimes called extremal points of $K$, so that $L(K)$ constitutes the set of all extremal points of $K$. As $K$ is symmetric in $O$ it is evident that $L(K)$ is also symmetric in $O$. Further:

LEMMA 1. The convex hull of $L(K)$ is $K$.

LEMMA 2. Given $X \in L(K)$ and $\epsilon > 0$ there exists a convex body $K(\epsilon) \subset K$ such that $X \notin K(\epsilon)$ and such that any point of $K - K(\epsilon)$ lies within a distance $\epsilon$ of one of the two points $\pm X$.


3. C-irreducibility. Let K be a star body in R_n. Further let Λ be a critical 
lattice of K. Let X be a point of Λ on the boundary of K. We say that Λ is 
free at the point X if, given ε > 0, there exists a lattice Λ(ε) of determinant 
d(Λ(ε)) < d(Λ) = Δ(K) such that the interior of K contains no point of Λ(ε) 
apart from 0 and any that are within a distance ε from one of the two points 
±X. Rogers’ criterion for S-irreducibility is then as follows: 


Lemma 3. K is S-irreducible if, and only if, to each point of the boundary 
of K there corresponds a critical lattice of K that is free at this point. 


We now give an analogous criterion for C-irreducibility. 


THEOREM 1. If K is a convex body then K is C-irreducible if, and only if, to 
each point of L(K) there corresponds a critical lattice of K that is free at this 
point. 


Proof. (i) Only if: Assume that K is C-irreducible and let X be an arbitrary 
point of L(K). By Lemma 2 given ε > 0 there exists a convex body K(ε) ⊂ K 
such that X ∈ K − K(ε) and such that any point of K − K(ε) is within a 
distance ε from one of the two points ±X. Since K(ε) is properly contained 
in K it follows that Δ(K(ε)) < Δ(K). Hence there exists a critical lattice 
Λ(ε) of K(ε) of determinant d(Λ(ε)) < Δ(K). It is evident that K contains 
no point of Λ(ε) in its interior other than 0 and any that may lie within a 
distance ε from one of the two points ±X. Moreover Λ(ε) is certainly not 
K-admissible and therefore taking into account the fact that K is symmetric 
in 0 we conclude that there must be a point of Λ(ε) in the interior of K and 













within a distance ε from the point X. The sequence Λ(n^{-1}) of lattices is 
compact in the sense of Mahler (3) and so contains a convergent subsequence 
with the limit Λ′ say. But lim_{n→∞} K(n^{-1}) = K and Λ(n^{-1}) is a critical lattice 
of K(n^{-1}) for each n, hence Λ′ is a critical lattice of K. Further each Λ(n^{-1}) 
contains a point within a distance n^{-1} from the point X. Thus Λ′ contains X 
which implies that Λ′ is free at X. As X was chosen an arbitrary point of L(K) 
this proves (i). 

(ii) If: Assume that to each point of L(K) there corresponds a critical 
lattice of K that is free at this point. Take an arbitrary convex body K′ ⊂ K 
such that K′ ≠ K. There exists a point X ∈ L(K) − K′ for otherwise 
L(K) ⊂ K′ and so by Lemma 1 K′ = K contrary to hypothesis. Let X ∈ L(K) 
− K′ be fixed. As K′ is closed there exists ε > 0 such that no point within 
a distance ε from either of the two points ±X is in K′. By hypothesis there 
exists a critical lattice Λ of K such that Λ is free at the point X. In particular 
this implies that there exists a lattice Λ(ε) of determinant d(Λ(ε)) < d(Λ) = Δ(K) 
such that no point of Λ(ε) apart from 0 and any that may lie within a distance 
ε from one of the two points ±X is an inner point of K. Hence Λ(ε) is K′-admissible 
from which it follows that Δ(K′) ≤ d(Λ(ε)) < Δ(K). Whence K 
is C-irreducible. This completes the proof of the theorem. 


4. An Example. In looking for a convex body that is C-irreducible and 
S-reducible we may by Mahler’s result confine our attention to dimensions 
n ≥ 3. Further if K is a strictly convex body it is obvious that L(K) is the 
whole boundary of K. Hence using the previous results K is C-irreducible if, 
and only if, it is S-irreducible. Again, Dr. Kathleen Ollerenshaw has obtained 
the following two results (4): 

(a) The n-dimensional parallelopiped is S-irreducible for every n. 

(b) If K is a two-dimensional S-irreducible convex body then the three- 
dimensional cylinder on the base K is also S-irreducible. 

A more suitable candidate for our purpose has proved to be a sawn-off 
three-dimensional cube. Whitworth (6) has shown that the convex body K 
in R_3 defined by the inequalities 

\[ |x_1| \le 1, \quad |x_2| \le 1, \quad |x_3| \le 1, \quad |x_1 + x_2 + x_3| \le \tfrac{1}{2} \]

has the critical determinant Δ(K) = 3/8. He has further determined all the 
critical lattices of K. It is necessary to give a table of these here but before 
doing so we remark that K has the six automorphisms obtained by permuting 
the co-ordinates together with the reflections in 0. Thus given any 
critical lattice of K we obtain six when we apply these transformations. In 
the following table the only critical lattices of K not included are those 
obtainable from the ones stated by applying the above automorphisms of K. 
There are three classes: 


Class I: Λ(ρ, σ, β) of basis X_1 = (ρ − 1/2, σ − 1, β), X_2 = (ρ, σ − 1/2, β − 1), 
X_3 = (ρ − 1, σ, β − 1/2) where ρ + σ + β = 2. Another basis for Λ(ρ, σ, β) 
would be X_2, X_2 − X_1 = (1/2, 1/2, −1), X_3 − X_2 = (−1, 1/2, 1/2). The points 
X_2 − X_1, X_3 − X_2 lie in the plane x_1 + x_2 + x_3 = 0 while X_2 lies in the 
plane x_1 + x_2 + x_3 = 1/2. Hence all points of Λ(ρ, σ, β) that lie on the boundary 
of K are confined to the three planes x_1 + x_2 + x_3 = 0 or ±1/2. It follows 
that the same is true of the automorphic images of Λ(ρ, σ, β). 


Class II: Λ(λ, μ, β) of basis X_1 = (1, −1/2, −1/2), X_2 = (−1/2, 1, −1/2), 
X_3 = (−λ, −μ, β) where λ + μ − β = 1/2, 0 ≤ −β ≤ 1/2, 0 ≤ μ ≤ 1/2, 
0 ≤ λ ≤ 1/2. The points X_1, X_2 lie in the plane x_1 + x_2 + x_3 = 0 while the 
point X_3 lies in the plane x_1 + x_2 + x_3 = −1/2. Hence all points of Λ(λ, μ, β) 
that lie on the boundary of K are confined to the three planes x_1 + x_2 + x_3 = 0 
or ±1/2 and the same is true of the automorphic images of Λ(λ, μ, β). 


Class III: (i) Λ(ν_1, ν_2, χ_1, χ_2, β) of basis X_1 = (−ν_1, β, −χ_1), X_2 = 
(−ν_2, 1 − β, −χ_2), X_3 = (1, −1/2, −1/2) where ν_1 + ν_2 = 1/2, χ_1 + χ_2 = 1/2, 
β − ν_1 − χ_1 = ±1/2. The points X_1, X_2 lie in one of the planes x_1 + x_2 + x_3 
= ±1/2 while the point X_3 lies in the plane x_1 + x_2 + x_3 = 0 and hence all 
points of Λ(ν_1, ν_2, χ_1, χ_2, β) that are on the boundary of K are confined to 
the planes x_1 + x_2 + x_3 = 0 or ±1/2. 

(ii) Λ(λ) of basis X_1 = (1, −1/2, −1/2), X_2 = (−λ, −1/2, 1), X_3 = (1/2, 0, 0). 
Evidently the points of Λ(λ) that are on the boundary of K are confined to 
the lines given by (t, −u_1/2 − u_2/2, −u_1/2 + u_2) where u_1, u_2 have one of the 
following pairs of values: (0, 0), (1, 0), (−1, 0), (0, 1), (0, −1), (1, 1), 
(−1, −1), (2, 0), (−2, 0). Hence the points of all the automorphic images 
of Λ(λ) on the boundary of K are confined to the lines given above together 
with those obtained from them by permuting the co-ordinates. 

(iii) Λ of basis X_1 = (−1/2, 1, −1/2), X_2 = (1/2, −1/2, −1/2), X_3 = (1/2, 0, 0). 
The point X_1 lies in the plane x_1 + x_2 + x_3 = 0, X_2 in x_1 + x_2 + x_3 = −1/2, 
X_3 in x_1 + x_2 + x_3 = 1/2; hence all points of Λ that are on the boundary of K 
are confined to the planes x_1 + x_2 + x_3 = 0 or ±1/2. It follows that the same 
is true of the automorphic images of Λ. 
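
As a check on the table, each basis can be tested against Δ(K) = 3/8, since a critical lattice must have exactly this determinant. The sketch below is only an illustration: it assumes that the fractions in the table read as 1/2 where the print is unclear, and the sample parameter values and function names are chosen here, not taken from Whitworth (6).

```python
from fractions import Fraction as F

h = F(1, 2)  # assumed reading of the unclear fractions in the table


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def class_I(rho, sigma, beta):
    assert rho + sigma + beta == 2
    return [(rho - h, sigma - 1, beta),
            (rho, sigma - h, beta - 1),
            (rho - 1, sigma, beta - h)]


def class_II(lam, mu, beta):
    assert lam + mu - beta == h
    return [(1, -h, -h), (-h, 1, -h), (-lam, -mu, beta)]


def class_III_i(nu1, chi1, beta):
    # nu2 and chi2 follow from nu1 + nu2 = chi1 + chi2 = 1/2
    assert abs(beta - nu1 - chi1) == h
    return [(-nu1, beta, -chi1), (-(h - nu1), 1 - beta, -(h - chi1)), (1, -h, -h)]


def class_III_ii(lam):
    return [(1, -h, -h), (-lam, -h, 1), (h, 0, 0)]


class_III_iii = [(-h, 1, -h), (h, -h, -h), (h, 0, 0)]

# Sample parameter values (hypothetical, picked only to satisfy the constraints).
samples = {
    "I": class_I(F(3, 2), F(0), h),
    "II": class_II(F(1, 4), F(1, 8), F(-1, 8)),
    "III(i)": class_III_i(F(1, 4), F(1, 8), F(7, 8)),
    "III(ii)": class_III_ii(F(1, 3)),
    "III(iii)": class_III_iii,
}
dets = {name: abs(det3(*basis)) for name, basis in samples.items()}
```

Under this reading every entry of `dets` equals 3/8, which is consistent with the stated value of Δ(K).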


This completes the table of the critical lattices of K. We are now in a 
position to prove: 


THEOREM 2. K is C-irreducible and S-reducible. 


Proof. We show first that K is S-reducible. From the table given above 
we see that the only critical lattices of K with points on the boundary of K 
that do not lie in one of the three planes x_1 + x_2 + x_3 = 0 or ±1/2 are those 
in Class III (ii). The point (1, — 4, — 4) is on the boundary of K and in the 
plane x; + x2 + x; = 4. Therefore if it is a point of some critical lattice of 
K it must be in Class III (ii). However, it is obvious that no lattice of this 
class can contain (1, — 4, — 4) nor can any lattice which is derived from 
one of those stated by permuting the co-ordinates. Therefore (1, — 4, — 4) 
belongs to no critical lattice of K. By Lemma 3, K is S-reducible. 













We now show that K is C-irreducible. The set L(K) consists of the twelve 
points obtained by permuting the co-ordinates of the point (1, 1/2, −1) and 
taking the six points thus obtained together with their reflections in 0. Hence, 
by virtue of Theorem 1, K is C-irreducible if we can show that there exists 
a critical lattice of K which is free at the point (1, 1/2, −1). Take the lattice 
Λ(3/2, 0, 1/2) in Class I of the table above. A basis of this lattice is X_1 = (1, 
−1, 1/2), X_2 = (3/2, −1/2, −1/2), X_3 = (1/2, 0, 0). Another basis would be 
Y_1 = X_2 − X_1 = (1/2, 1/2, −1), Y_2 = X_1 − X_3 = (1/2, −1, 1/2), Y_3 = X_3. The 
points of Λ(3/2, 0, 1/2) on the boundary of K are Y_1, Y_2, Y_3, Y_1 + Y_2 = (1, 
−1/2, −1/2), Y_1 + Y_3 = (1, 1/2, −1), Y_1 − Y_3 = (0, 1/2, −1), Y_2 − Y_3 = (0, 
−1, 1/2), Y_2 + Y_3 = (1, −1, 1/2), Y_1 + Y_2 − Y_3 = (1/2, −1/2, −1/2) together 
with their reflections in 0. In particular we see that Y_1 + Y_3 = (1, 1/2, −1) is 
a point of the lattice. For a given δ > 0 denote by Λ(δ) the lattice of basis 
Y_1′ = (1/2 − δ, 1/2, −1), Y_2′ = (1/2 + δ, −1, 1/2), Y_3′ = (1/2 − δ, 0, δ). Evidently 
as δ → 0 so Y_1′ → Y_1, Y_2′ → Y_2, Y_3′ → Y_3 and therefore also Λ(δ) → Λ(3/2, 0, 
1/2). Moreover, 


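\[
d(\Lambda(\delta)) =
\left| \det \begin{pmatrix}
\tfrac{1}{2} - \delta & \tfrac{1}{2} & -1 \\
\tfrac{1}{2} + \delta & -1 & \tfrac{1}{2} \\
\tfrac{1}{2} - \delta & 0 & \delta
\end{pmatrix} \right|
= \tfrac{3}{8} - \tfrac{1}{2}\delta^{2} < \tfrac{3}{8}
\]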


provided only that δ is sufficiently small. Since in the limit δ → 0 the basis 
given for Λ(δ) becomes the basis given for Λ(3/2, 0, 1/2) it follows that for all 
sufficiently small δ the only points of Λ(δ) that can lie in the interior of K 
are 

Y_1′ = (1/2 − δ, 1/2, −1), Y_2′ = (1/2 + δ, −1, 1/2), Y_3′ = (1/2 − δ, 0, δ), 
Y_1′ + Y_2′ = (1, −1/2, −1/2), Y_1′ + Y_3′ = (1 − 2δ, 1/2, δ − 1), Y_2′ + Y_3′ = (1, −1, 1/2 + δ), 
Y_1′ − Y_3′ = (0, 1/2, −1 − δ), Y_2′ − Y_3′ = (2δ, −1, 1/2 − δ), Y_1′ + Y_2′ − Y_3′ 

together with their reflections in 0. But it is clear that the only ones in the 
interior of K are ±(Y_1′ + Y_3′). Moreover 

\[ \lim_{\delta \to 0} (Y_1' + Y_3') = (1, \tfrac{1}{2}, -1), \]

hence Λ(3/2, 0, 1/2) is free at the point (1, 1/2, −1). Therefore K is C-irreducible. 
This completes the proof of Theorem 2. 
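
The two facts used about the perturbed lattice can be checked by exact arithmetic. The following sketch assumes the basis Y_1′ = (1/2 − δ, 1/2, −1), Y_2′ = (1/2 + δ, −1, 1/2), Y_3′ = (1/2 − δ, 0, δ) and the body |x_i| ≤ 1, |x_1 + x_2 + x_3| ≤ 1/2; the function names and the search box for the integer coefficients are illustrative choices, not part of the original argument.

```python
from fractions import Fraction as F
from itertools import product


def basis(delta):
    """Assumed basis Y1', Y2', Y3' of the perturbed lattice."""
    h = F(1, 2)
    return [(h - delta, h, -1), (h + delta, -1, h), (h - delta, 0, delta)]


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def in_interior(p):
    """Strict form of |x_i| <= 1 and |x_1 + x_2 + x_3| <= 1/2."""
    return all(abs(c) < 1 for c in p) and abs(sum(p)) < F(1, 2)


def interior_points(delta, bound=3):
    """Non-zero lattice points inside K with coefficients in [-bound, bound]."""
    ys = basis(delta)
    found = set()
    for u in product(range(-bound, bound + 1), repeat=3):
        if u == (0, 0, 0):
            continue
        p = tuple(sum(u[k] * ys[k][i] for k in range(3)) for i in range(3))
        if in_interior(p):
            found.add(p)
    return found


delta = F(1, 100)
determinant = abs(det3(*basis(delta)))   # equals 3/8 - delta**2 / 2
inside = interior_points(delta)          # only +-(Y1' + Y3') should appear
```

For δ = 1/100 the search finds only the two points ±(1 − 2δ, 1/2, δ − 1), which lie within a distance of order δ of ±(1, 1/2, −1).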


Part of this work is extracted from a thesis for the degree of Doctor of 
Philosophy at the University of Manchester, written under the supervision 
of Professor K. Mahler to whom I am very grateful for advice and encourage- 
ment. 







REFERENCES 


1. K. Mahler, Lattice points in n-dimensional star bodies II, Reducibility theorems, Proc. Nederl. 
Akad. Wetensch., 49 (1946), 331-43. 

2. K. Mahler, On irreducible convex domains, Proc. Nederl. Akad. Wetensch., 50 (1947), 98-107. 

3. K. Mahler, Lattice points in n-dimensional star bodies, I, Existence theorems, Proc. Roy. Soc. 
London, Ser. A, 187 (1946), 151-87. 

4. K. Ollerenshaw, Irreducible convex bodies, Quart. J. Math., Oxford (2), 4 (1953), 293-302. 

5. C. A. Rogers, A note on irreducible star bodies, Proc. Nederl. Akad. Wetensch., 50 (1947), 
868-72. 

6. J. V. Whitworth, On the densest packing of sections of a cube, Ann. Mat. Pura Appl., Ser. 4, 
27 (1948), 29-37. 


Tulane University of Louisiana 











DENSE SUBGRAPHS AND CONNECTIVITY 
R. E. NETTLETON, K. GOLDBERG, AND M. S. GREEN 


A proper subgraph of a connected linear graph is said to disconnect the 
graph if removing it leaves a disconnected graph. In this paper we characterize, 
in the following sense, the disconnecting subgraphs of a fixed connected graph. 
We define two distinct types of disconnecting subgraphs (isthmuses and 
articulators) which are minimal in the sense that no proper subgraph of 
either type can disconnect the graph. We then show that any disconnecting 
subgraph must contain either an isthmus or an articulator. We also define a 
set of subgraphs (called dense) which form a lattice. We show that the union 
of the minimal dense subgraphs contains all isthmuses and articulators. In 
terms of these subgraphs we investigate some of the consequences of assuming 
that a disconnecting subgraph must contain at least m points. 


1. Definitions. A (linear, undirected) graph G is a finite set of elements 
p_1, p_2, ..., p_n called points, and a set of ordered pairs of these elements defining 
a symmetric, non-reflexive binary relation. Two points occurring in an 
ordered pair are said to be neighbours. A subgraph of G is a subset of the 
points of G together with all the ordered pairs in G containing only elements 
of the subset. A subgraph is thus determined by its set of points when the 
binary relation of G is understood. 

Two distinct points, p and q, in G are said to be connected by a path of 
length k if there exist k + 1 distinct points p = p_1, p_2, ..., p_{k+1} = q such 
that the ordered pairs (p_i, p_{i+1}), for i = 1, 2, ..., k, are in G. The distance 
between two points is the length of the shortest path between them. The 
diameter of the graph is the greatest distance between pairs of points in the 
graph. 

A graph having only one point or more than one point and every pair of 
points connected is also called connected. If every pair of points are neigh- 
bours the graph is called completely connected. A graph which is not connected 
is called disconnected. The null graph is disconnected. 

The union, intersection, and difference of two subgraphs G_1 and G_2 is the 
subgraph whose set of points is the union, intersection, or difference of the 
sets of points of G_1 and G_2. We denote the union by G_1 + G_2, and the difference 
by G_1 − G_2. 

If a graph is not connected it is the union of a set of disjoint subgraphs 
each one of which is connected and such that the union of any two is not 
connected. This set is unique and we refer to it as the partition of the graph. 
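
The definitions above translate directly into a small data structure. The sketch below is one possible encoding, not the authors' own: a graph is a dictionary from each point to the set of its neighbours, and the names `subgraph`, `partition`, `is_connected`, `distance`, and `diameter` are chosen here for illustration.

```python
from collections import deque


def subgraph(adj, points):
    """Subgraph determined by a set of points (all pairs inside the set are kept)."""
    pts = set(points)
    return {p: adj[p] & pts for p in pts}


def partition(adj):
    """The connected pieces of a graph, as a list of sets of points."""
    seen, parts = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        piece, queue = {start}, deque([start])
        while queue:
            p = queue.popleft()
            for q in adj[p] - piece:
                piece.add(q)
                queue.append(q)
        seen |= piece
        parts.append(piece)
    return parts


def is_connected(adj):
    """Follows the convention of the text: the null graph is disconnected."""
    return len(adj) > 0 and len(partition(adj)) == 1


def distance(adj, p, q):
    """Length of the shortest path between p and q (None if there is none)."""
    dist, queue = {p: 0}, deque([p])
    while queue:
        x = queue.popleft()
        if x == q:
            return dist[x]
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return None


def diameter(adj):
    """Greatest distance between pairs of points of a connected graph."""
    return max(distance(adj, p, q) for p in adj for q in adj if p != q)


# Example: the path 1 - 2 - 3 - 4, and the graph left after removing point 2.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
rest = subgraph(path, {1, 3, 4})
```

Here `partition(rest)` gives the two pieces {1} and {3, 4}, so removing the point 2 leaves a disconnected graph.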


Received March 27, 1958. 


We say that a proper subgraph G′ of a connected graph G disconnects G if 
G − G′ is disconnected. We shall be interested in ways of disconnecting a 
fixed connected graph G containing n points and to this end we introduce 
two definitions. 

A k-isthmus of G is a completely connected subgraph which has k points, 
disconnects G, but does not properly contain a completely connected subgraph 
which disconnects G. A k-articulator of G is a subgraph G′ which has k points, 
disconnects G, is not completely connected, and has the property that each 
subgraph in the partition of G − G′ has a neighbour of each point in G′. We 
shall use the generic terms isthmus and articulator when the number of points 
is irrelevant. 

For example, if we denote G pictorially with lines representing the relation 
between points we can see the isthmuses and articulators in the following 
connected graphs: 






































[FIGURE 1, FIGURE 2, FIGURE 3: three connected graphs with numbered points; the drawings are not reproduced here.] 


In Figure 1 the subgraphs with point sets {1, 4} and {2,3} are articulators 
but there are no isthmuses. In Figure 2 {2, 3} is an isthmus but there are no 
articulators. In Figure 3 {2, 4} is an isthmus and {1, 4} and {2,3} are arti- 
culators. 
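
The two definitions can be tested by brute force on small graphs. In the sketch below the two example graphs are assumptions made for illustration: a 4-cycle and a pair of triangles sharing an edge, chosen because they agree with what the text says about Figures 1 and 2 (the original drawings are not available). All function names are hypothetical.

```python
from itertools import combinations


def components(adj, points):
    """Partition of the subgraph on `points` into connected pieces."""
    left, parts = set(points), []
    while left:
        piece, stack = set(), [min(left)]
        while stack:
            p = stack.pop()
            if p in piece:
                continue
            piece.add(p)
            stack.extend((adj[p] & left) - piece)
        left -= piece
        parts.append(piece)
    return parts


def disconnects(adj, sub):
    """A non-empty proper subgraph disconnects G if G minus it is disconnected."""
    rest = set(adj) - set(sub)
    return bool(sub) and bool(rest) and len(components(adj, rest)) > 1


def complete(adj, sub):
    """Completely connected: every pair of points are neighbours."""
    return all(q in adj[p] for p, q in combinations(sub, 2))


def proper_subsets(sub):
    items = sorted(sub)
    for k in range(1, len(items)):
        for c in combinations(items, k):
            yield set(c)


def is_isthmus(adj, sub):
    return (complete(adj, sub) and disconnects(adj, sub)
            and not any(complete(adj, s) and disconnects(adj, s)
                        for s in proper_subsets(sub)))


def is_articulator(adj, sub):
    if complete(adj, sub) or not disconnects(adj, sub):
        return False
    rest = set(adj) - set(sub)
    # every piece of G - G' must hold a neighbour of every point of G'
    return all(adj[p] & piece for piece in components(adj, rest) for p in sub)


def classify(adj):
    isthmuses, articulators = [], []
    for k in range(1, len(adj)):
        for c in combinations(sorted(adj), k):
            if is_isthmus(adj, set(c)):
                isthmuses.append(c)
            elif is_articulator(adj, set(c)):
                articulators.append(c)
    return isthmuses, articulators


cycle = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}               # assumed Figure 1
two_triangles = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}  # assumed Figure 2
```

With these graphs `classify(cycle)` reports the articulators {1, 4} and {2, 3} and no isthmus, while `classify(two_triangles)` reports the single isthmus {2, 3} and no articulator.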

We now define a type of subgraph which we shall prove has a close connection 
with the isthmuses and articulators of G. A connected subgraph G′ 
is called dense if G′ = G or if every point in G − G′ has a neighbour in G′. A 
dense proper subgraph which is contained in no other dense subgraph except 
G we call D-maximal; a dense subgraph containing no other dense subgraph 
we call D-minimal. We let S_D denote the collection of dense subgraphs of G 
ordered by inclusion together with the empty graph ∅. We let Γ denote the 
union of all D-minimal subgraphs of G. Unless otherwise stated all dense 
subgraphs, isthmuses, and articulators are those of G. 
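
For small graphs the dense, D-minimal, and D-maximal subgraphs can simply be enumerated. The following is a minimal sketch under the assumption that G is connected and given as a dictionary of neighbour sets; the function names and the two test graphs are illustrative only.

```python
from itertools import combinations


def connected(adj, pts):
    """True when the subgraph on `pts` is non-empty and connected."""
    pts = set(pts)
    if not pts:
        return False
    seen, stack = set(), [next(iter(pts))]
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend((adj[p] & pts) - seen)
    return seen == pts


def dense(adj, pts):
    """Connected, and every point outside has a neighbour inside."""
    pts = set(pts)
    return connected(adj, pts) and all(adj[p] & pts for p in set(adj) - pts)


def dense_subgraphs(adj):
    points = sorted(adj)
    return [set(c) for k in range(1, len(points) + 1)
            for c in combinations(points, k) if dense(adj, c)]


def d_minimal(adj):
    ds = dense_subgraphs(adj)
    return [s for s in ds if not any(t < s for t in ds)]


def d_maximal(adj):
    ds, full = dense_subgraphs(adj), set(adj)
    return [s for s in ds if s != full and not any(s < t < full for t in ds)]


def gamma(adj):
    """Union of all D-minimal subgraphs."""
    return set().union(*d_minimal(adj))


path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
cycle = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
```

On the path 1 - 2 - 3 - 4 the only D-minimal subgraph is {2, 3}, so Γ = {2, 3}; on the 4-cycle every pair of neighbours is D-minimal and Γ is the whole graph.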

We call G m-connected if G — G’ is connected for every subgraph G’ con- 
taining fewer than m points. 

The subgraph of neighbours of a point p in a subgraph G’ we denote by 
G'(p). 


2. Dense Subgraphs. Suppose G_1 is a dense subgraph and G_2 is a subgraph 
containing G_1. Since every point in G_2 − G_1 has a neighbour in the connected 
graph G_1 it follows that G_2 is connected. Also every point in G − G_2 has a 
neighbour in G_2 (in fact in G_1). Therefore we have 


LEMMA 2.1. A subgraph which contains a dense subgraph is also dense. 
















Let G_1 and G_2 be two dense subgraphs. By this lemma their union is also 
dense. Their intersection need not be dense but if it is not it cannot contain 
a dense subgraph, again by this lemma. Therefore we have 


THEOREM 2.2. S_D is a lattice in which the l.u.b. is the graph union, and the 
g.l.b. is the graph intersection when the intersection is dense and otherwise is ∅. 


Applying Lemma 2.1 to the definition of D-maximal we get 


LEMMA 2.3. A subgraph is D-maximal if and only if it is connected and 
contains n — 1 points. 


Suppose n > 2 and let d denote the diameter of G. Let p_1 and p_2 be points 
such that the distance between them is d. If G − p_1 is not connected let 
{G_i} denote its partition. Suppose p_2 is in G_1 and let p_3 be a point in G_2 which 
is a neighbour of p_1. Any path between p_2 and p_3 must pass through p_1 since 
removing p_1 disconnects an otherwise connected graph. Thus the distance 
between p_2 and p_3 is d + 1 which is a contradiction. It follows that G − p_1, 
and G − p_2 by symmetry, is connected and so a D-maximal subgraph. We 
have proved 


THEOREM 2.4. If n > 2 then G contains at least two D-maximal subgraphs. 
Thus every point is contained in a proper dense subgraph. 
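
The argument can be followed on a concrete graph: pick two points at the greatest distance and confirm that removing either one leaves a connected graph. The sketch below does this by breadth-first search; the example tree and the function names are assumptions made for illustration.

```python
from collections import deque


def distances_from(adj, p, removed=frozenset()):
    """Shortest path lengths from p, ignoring the points in `removed`."""
    dist, queue = {p: 0}, deque([p])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist and y not in removed:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def diametral_pair(adj):
    """Return (d, p1, p2) with d the diameter and p1, p2 at distance d."""
    best = None
    for p in sorted(adj):
        for q, d in distances_from(adj, p).items():
            if best is None or d > best[0]:
                best = (d, p, q)
    return best


def connected_without(adj, point):
    """True when G - point is connected (G is assumed to have n > 2 points)."""
    rest = set(adj) - {point}
    start = next(iter(rest))
    return set(distances_from(adj, start, removed={point})) == rest


# Hypothetical example: a tree with edges 1-2, 2-3, 3-4, 3-5.
tree = {1: {2}, 2: {1, 3}, 3: {2, 4, 5}, 4: {3}, 5: {3}}
d, p1, p2 = diametral_pair(tree)
```

In this tree the diameter is 3, and both G − p_1 and G − p_2 are connected subgraphs with n − 1 points, that is, D-maximal subgraphs in the sense of Lemma 2.3.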


We shall show that if G = Γ and the D-minimal subgraphs are mutually 
disjoint then G is completely connected. First we need 


LEMMA 2.5. Let {Γ_i} be a collection of mutually disjoint dense subgraphs whose 
union is G. If at least one of the Γ_i contains more than one point then there is a 
dense subgraph containing none of the Γ_i. 


We shall prove this by constructing the desired dense subgraph G′. 

Let Γ_1, Γ_2, ..., Γ_s be those subgraphs among the Γ_i containing more 
than one point. 

If s = 1 let G′ denote an arbitrary D-maximal subgraph of Γ_1. Every 
point in Γ_1 has a neighbour in G′ and every other point in G is dense and so 
is a neighbour to every point in G′. Thus G′ is a dense subgraph of G properly 
contained in Γ_1 and disjoint from the other Γ_i, as desired. 

Now suppose s ≥ 2. Choose an arbitrary point q in Γ_1. Let q_1 = q and q_i 
be a neighbour of q in Γ_i for i = 2, 3, ..., s. Since each Γ_i (i = 1, 2, ..., s) 
contains more than one point, it contains a D-maximal subgraph G_i which 
contains q_i. Let G′ be the union of the G_i. 

Each of the G_i is connected and each has a neighbour of q or contains q 
so G′ is connected. Let p be an arbitrary point in G distinct from q. Let Γ_j 
be that subgraph containing p. If Γ_j contains no other points p is a neighbour 
of every point in G′. If Γ_j has more than one point then either p has a neighbour 
in G_j (if p is not in G_j or G_j contains more than one point) or is a neighbour 
of q (if p is the only point in G_j). Thus G′ is dense and we can complete 
our argument as before. 
We can now prove 


THEOREM 2.6. If G = Γ and the D-minimal subgraphs are mutually disjoint 
then G is completely connected. 


Given the hypothesis, if any D-minimal subgraph contains more than one 
point we can apply Lemma 2.5 to obtain a dense subgraph not containing 
any D-minimal subgraph. Since this is absurd every D-minimal subgraph 
contains exactly one point. Therefore every point of G is dense and so G is 
completely connected. 

Now we prove 


THEOREM 2.7. If a point p is not D-minimal then Γ(p) disconnects G. 


Suppose G − Γ(p) is connected. It contains p and every point in Γ(p) is 
a neighbour of p so that G − Γ(p) is dense. Thus G − Γ(p) contains a 
D-minimal subgraph G′ having no neighbours of p. This is possible only if 
G′ = p. 

We have incidentally proved 


LEMMA 2.8. If G — (G(p) — H) is connected it is dense. 
LEMMA 2.9. G — G(p) is connected if and only if p is D-minimal. 


3. Connectivity. We begin by finding a necessary and sufficient condition 
that a subgraph be an articulator or an isthmus. 

If G’ is an articulator or an isthmus then G — G’ is not connected. Let p 
be any point in G’ and consider G — G’ + p. If G’ is an articulator every 
subgraph in the partition of G — G’ contains a neighbour of p so G — G’ + p 
is connected. Likewise every point in G’ — p has a neighbour in G — G’ and 
so in G —G’+p=G — (G’ — p). Therefore G — G’ + p is dense as is 
G — G” for every proper subgraph G” of G’. If G’ is an isthmus then G’ — p 
is completely connected so G − G′ + p is connected. Every point in G′ − p 
is a neighbour of p so G − G′ + p is dense as is G − G″ for every proper 
subgraph G″ of G′. 

Now suppose G′ is a subgraph which disconnects G but G − G′ + p is 
connected for every point p in G′. Then such a point must have a neighbour 
in every subgraph of the partition of G − G′ so G′ is an articulator if it is 
not completely connected and an isthmus if it is. Thus we have 


THEOREM 3.1. A subgraph G′ is either an articulator or an isthmus if and only 
if it disconnects G and G − G″ is connected (and so dense) for every proper 
subgraph G″ of G′. 


COROLLARY. An articulator does not properly contain an articulator. An 
articulator does not contain an isthmus and conversely. 













Let G′ be an articulator or isthmus and let p be any point in G′. Then, by 
Theorem 3.1, G − G′ + p is dense and so contains a D-minimal subgraph 
which must contain p since G − G′ is not dense. It follows that every point 
in G′ is in Γ. That is 


THEOREM 3.2. All articulators and all isthmuses are contained in Γ. 


Let G′ be any subgraph which disconnects G. It is either an articulator or 
an isthmus or, by Theorem 3.1, contains a proper subgraph G″ which disconnects 
G. By repeating the argument on G″ we are eventually led to the 
case when G − p is not connected for a point p. Since such a point p is an 
isthmus we have 


THEOREM 3.3. A subgraph which disconnects G contains an articulator or an 
isthmus. 
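
Theorem 3.3 can be confirmed by exhaustive search on small graphs. The sketch below restates the definitions of Section 1 in code and checks every disconnecting subgraph; the two test graphs (a 4-cycle, and two triangles sharing an edge with one extra pendant point) and all function names are hypothetical choices for illustration.

```python
from itertools import combinations


def pieces(adj, points):
    """Number of connected pieces of the subgraph on `points`."""
    left, count = set(points), 0
    while left:
        stack, piece = [next(iter(left))], set()
        while stack:
            p = stack.pop()
            if p not in piece:
                piece.add(p)
                stack.extend((adj[p] & left) - piece)
        left -= piece
        count += 1
    return count


def disconnects(adj, sub):
    rest = set(adj) - sub
    return bool(sub) and bool(rest) and pieces(adj, rest) > 1


def complete(adj, sub):
    return all(q in adj[p] for p, q in combinations(sub, 2))


def subsets(sub, proper=False):
    items = sorted(sub)
    top = len(items) - 1 if proper else len(items)
    for k in range(1, top + 1):
        for c in combinations(items, k):
            yield set(c)


def is_isthmus(adj, sub):
    return (complete(adj, sub) and disconnects(adj, sub)
            and not any(complete(adj, s) and disconnects(adj, s)
                        for s in subsets(sub, proper=True)))


def is_articulator(adj, sub):
    if complete(adj, sub) or not disconnects(adj, sub):
        return False
    # By Theorem 3.1 this is equivalent to: G - sub + p is connected for each p.
    return all(pieces(adj, (set(adj) - sub) | {p}) == 1 for p in sub)


def theorem_3_3_holds(adj):
    """Every disconnecting subgraph contains an articulator or an isthmus."""
    return all(any(is_isthmus(adj, s) or is_articulator(adj, s) for s in subsets(c))
               for c in subsets(set(adj), proper=True) if disconnects(adj, c))


cycle = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
kite = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5}, 5: {4}}
```

In `kite` the point 4 is a 1-isthmus and {2, 3} is a 2-isthmus, and every other disconnecting subgraph contains one of the two.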


We now turn to some of the consequences of m-connectivity and obtain 


THEOREM 3.4. If G is m-connected then 

1. G — G’ is dense for every subgraph G’ containing less than m points, and 
conversely. 

2. G contains no k-isthmus for k = 1,2,...,m — 1 and no k-articulator for 
k = 2,3,...,m —1, and conversely. 

3. Γ contains at least m points as does Γ(p) for every point p in G which is 
not D-minimal. 

4. The intersection of any m − 1 D-maximal subgraphs is dense. 

5. An m-articulator of Γ is an m-articulator of G. 

6. Either G is completely connected and n = m, or it is not and n ≥ m + 2. 


7. If p is a point in G − Γ and q is a point in G(p) then G(q) has more than 
m points. 


Let G be m-connected and G’ be a subgraph containing less than m points. 
Suppose there is a point p in G’ without a neighbour in G — G’. Then G — G’ 
+ p = G — (G’ — p) is not connected contrary to the definition of m-con- 
nected. It follows that every point p in G’ has a neighbour in the connected 
subgraph G — G’ which is thus dense. The converse is clear and so part 1 
is proved. 

The necessity of part 2 follows from Theorem 3.3 and the sufficiency is clear. 

Since the second half of part 3 follows from Theorem 2.7 we must show 
that Γ contains at least m points. If G is completely connected then G = Γ 
so Γ is m-connected and thus contains at least m points. If G is not completely 
connected it contains at least one point which is not D-minimal. 
That point has at least m neighbours in Γ so Γ contains at least m points. 

Part 4 follows from part 1 and Lemma 2.3. 

Part 5 follows from 










LEMMA 3.5. An articulator of Γ disconnects G. 


In proving this we do not assume that G is m-connected. 

Let Γ′ be an articulator of Γ and suppose G − Γ′ is connected. Every 
point in Γ′ has a neighbour in Γ − Γ′ and so in G − Γ′. Therefore G − Γ′ 
is dense and so contains a D-minimal subgraph G′. But G − Γ′ − (G − Γ) 
= Γ − Γ′ is not dense. This implies that G − Γ contains some point of G′ 
contrary to the definition of Γ. It follows that Γ′ disconnects G. 

Now suppose G is m-connected and Γ′ contains m points. By this lemma 
and Theorem 3.3, Γ′ contains either an articulator or an isthmus. But it 
cannot contain either properly by part 2. Since Γ′ is not completely connected 
it is an articulator (of G). 

If G is completely connected then n = m. Otherwise there are points p 
and q in G which are not neighbours so G − p − q disconnects G. It follows 
that m ≤ n − 2 and part 6 is proved. 

If p is a point in G − Γ and q is a point in G(p) then if q is not D-minimal 
it has at least m neighbours in Γ as well as at least one (that is p) in G − Γ. 
If q is D-minimal then it has n − 1 neighbours. Since G − Γ has a point G 
cannot be completely connected so n − 1 > m and part 7 is proved. 

As a partial converse of part 4 we prove 


THEOREM 3.6. If the intersection of any m > 2 D-maximal subgraphs is con- 
nected then there are no k-isthmuses or k-articulators for m > k > 2. 


Let G’ be a k-isthmus or k-articulator with m > k > 2, and let p be an 
arbitrary point in G’. By Theorem 3.1 we know that G — p is dense and so 
D-maximal. Thus G — G’ is the intersection of k < m D-maximal subgraphs 
but is not connected contrary to the hypothesis of the theorem. 

We complete this section with a few isolated results. 


THEOREM 3.7. If G is not completely connected but G(p) is for some point p 
then p is in G − Γ. 


Suppose Γ′ is a D-minimal subgraph containing p. If Γ′ = p then 
G(p) = G − p so G is completely connected contrary to hypothesis. Therefore 
Γ′ contains a neighbour q of p. Since every point which is a neighbour 
of p is a neighbour of q we see that Γ′ − p is dense, again contrary to hypothesis. 
Thus p is not in any D-minimal subgraph. 


THEOREM 3.8. The intersection of all dense subgraphs is exactly the subgraph 
of 1-isthmuses. 


Suppose p is a point contained in all dense subgraphs. Then G − p does not 
contain a dense subgraph and so is not dense. Since p has a neighbour in 
G − p the latter is not connected so p is a 1-isthmus. Conversely, if p is a 
1-isthmus G − p is not dense but G is so that every dense subgraph contains p. 













Added in proof: In order to complete the statements of Theorems 3.1 and 
3.3 we should have proved that if G is not completely connected it contains 
a disconnecting subgraph. This is trivial. For suppose G is not completely 
connected. Then it contains at least three points, at least two of which, say 
p and q, are not neighbours. Then G − p − q disconnects G. 


The authors are indebted to A. J. Hoffman of the General Electric Company 
for his many helpful suggestions. 


REFERENCES 


1. F. Harary and R. Z. Norman, The dissimilarity characteristic of Husimi trees, Ann. Math. 
58 (1953), pp. 134-41. 
2. D. König, Theorie der endlichen und unendlichen Graphen (Leipzig, 1936). 


National Bureau of Standards 
Washington, D.C. 










THE TERM AND STOCHASTIC RANKS OF A MATRIX 


A. L. DULMAGE AND N. S. MENDELSOHN 


1. Introduction. The term rank ρ of a matrix is the order of the largest 
minor which has a non-zero term in the expansion of its determinant. In a 
recent paper (1), the authors made the following conjecture. If S is the sum 
of all the entries in a square matrix of non-negative real numbers and if M 
is the maximum row or column sum, then the term rank ρ of the matrix is 
greater than or equal to the least integer which is greater than or equal to 
S/M. A generalization of this conjecture is proved in § 2. 

The term doubly stochastic has been used to describe a matrix of non- 
negative entries in which the row and column sums are all equal to one. In 
this paper, by a doubly stochastic matrix, the authors mean a matrix of 
non-negative entries in which the row and column sums are all equal to the 
same real number T. If an n × n matrix A is embedded by the addition to 
A of r rows and columns in an (n + r) × (n + r) matrix B with row and 
column sums equal to T, we say that B is an (r, T) doubly stochastic (abbreviated 
as (r, T) d.s.) extension of A. In (1), the authors made use of a d.s. extension 
of a matrix A to obtain an estimate of the term rank of A. In this paper, the 
authors describe all such extensions and give a necessary and sufficient condition 
that a matrix B be a vertex matrix of the convex set of all (r, T) d.s. 
extensions of A. 

For a square matrix of non-negative entries, the concept of stochastic rank 
is introduced. Some results concerning this rank are obtained and the connection 
between it and term rank is noted. 

In the final section, the problem of finding all d.s. extensions of a matrix A 
is formulated as a linear programming problem. 


2. A lower bound for term rank. Let I and J be arbitrary sets and 
let f(i, j), i ∈ I, j ∈ J, be a real-valued non-negative function on I × J which 
is not identically zero. The concept of term rank can be extended to such a 
function f(i, j) as follows. A finite set of pairs (i_1, j_1), (i_2, j_2), ..., (i_r, j_r) is 
disjoint if i_p = i_q only if p = q and if j_p = j_q only if p = q. A function f(i, j) 
has term rank ρ if, and only if, there exists a disjoint set of pairs (i_1, j_1), 
(i_2, j_2), ..., (i_ρ, j_ρ) such that f(i_r, j_r) > 0 for r = 1, 2, ..., ρ but for any 
disjoint set consisting of ρ + 1 pairs, f(i, j) = 0 for at least one pair of the 
set. If no such maximal disjoint set exists, the term rank is infinite. 

Let σ be the collection of finite subsets of I and τ the collection of finite 
subsets of J. In this setting, we have the following theorem. 


Received May 8, 1958. 


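THEOREM 1. If f(i, j) satisfies the conditions 

\[ \text{(i)} \quad R_i = \sup_{B \in \tau} \Big[ \sum_{j \in B} f(i,j) \Big] \ \text{is finite for all } i \in I \]

and 

\[ \text{(ii)} \quad C_j = \sup_{A \in \sigma} \Big[ \sum_{i \in A} f(i,j) \Big] \ \text{is finite for all } j \in J \]

then either the term rank ρ of the function f(i, j) is infinite or 

\[ S = \sup_{A \in \sigma,\, B \in \tau} \Big[ \sum_{i \in A} \sum_{j \in B} f(i,j) \Big] \quad \text{and} \quad M = \sup_{i \in I,\, j \in J} [R_i, C_j] \]

are finite and ρ is greater than or equal to the least integer which is greater than 
or equal to S/M. 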


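Proof. Let K be the graph of which the edges are the pairs (i, j) for which 
f(i, j) > 0. The vertex sets of this (bipartite) graph K are I and J. If ρ is 
finite, the exterior dimension (see (3)) of K is equal to ρ. If [P, Q] is a minimal 
exterior pair for K and if U, V is any pair of finite subsets of I and J, then, 
since f(i, j) = 0 for i ∉ P and j ∉ Q, we have 

\[
\begin{aligned}
\sum_{i \in U} \sum_{j \in V} f(i,j)
  &= \sum_{i \in U \cap P} \sum_{j \in V} f(i,j) + \sum_{i \in U - P} \sum_{j \in V \cap Q} f(i,j) \\
  &\le \sum_{i \in U \cap P} \sum_{j \in V} f(i,j) + \sum_{j \in V \cap Q} \sum_{i \in U} f(i,j) \\
  &\le \sum_{i \in P} R_i + \sum_{j \in Q} C_j ,
\end{aligned}
\]

which is finite and independent of U and V. 
Now 

\[ S = \sup_{U \in \sigma,\, V \in \tau} \sum_{i \in U} \sum_{j \in V} f(i,j) \le \sum_{i \in P} R_i + \sum_{j \in Q} C_j \]

so that S is finite. Further 

\[ R_i = \sup_{B \in \tau} \Big[ \sum_{j \in B} f(i,j) \Big] \le S \]

for all i. Similarly C_j ≤ S for all j. Thus, 

\[ M = \sup_{i \in I,\, j \in J} [R_i, C_j] \le S \]

so that M is finite. 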

Now, let t be the unique integer such that t − 1 < S/M ≤ t. We must 
show ρ ≥ t. If ρ < t then, since ρ is integral, we have ρ ≤ t − 1. If [P, Q] is 




a minimal exterior pair for K and ν(P) denotes the number of elements in P 
then ρ = ν(P) + ν(Q) (3, Theorem 2). It follows that 

\[ \rho M = (\nu(P) + \nu(Q)) M \ge \sum_{i \in P} R_i + \sum_{j \in Q} C_j \ge S. \]

Thus S/M ≤ ρ ≤ t − 1, a contradiction. 

If the sets I and J in Theorem 1 are finite sets of orders m and n, ρ becomes 
the term rank of an m × n matrix (a_ij) in which a_ij = f(i, j), M becomes the 
maximum row or column sum and S is the sum of all the entries in the matrix. 
If, in addition, m = n, Theorem 1 reduces to the conjecture in (1) referred 
to in the Introduction. 
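
For a finite matrix the bound can be checked numerically. The sketch below computes the term rank as the size of a largest set of positive entries with no two in the same row or column, using a standard augmenting-path matching (a technique chosen here, not taken from the paper), and compares it with the least integer not less than S/M. The function names and the sample matrix are illustrative.

```python
from fractions import Fraction
from math import ceil


def term_rank(a):
    """Largest number of positive entries, no two in one row or column."""
    n_cols = len(a[0])
    match_col = [-1] * n_cols          # row currently matched to each column

    def try_row(i, seen):
        for j in range(n_cols):
            if a[i][j] > 0 and j not in seen:
                seen.add(j)
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    return sum(try_row(i, set()) for i in range(len(a)))


def lower_bound(a):
    """Least integer greater than or equal to S/M."""
    rows = [sum(r) for r in a]
    cols = [sum(c) for c in zip(*a)]
    S, M = sum(rows), max(rows + cols)
    return ceil(Fraction(S) / Fraction(M))


a = [[2, 2, 0],
     [1, 0, 0],
     [3, 0, 0]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

For `a` we have S = 8 and M = 6, so the bound is 2, and the term rank is also 2 because all positive entries lie in the first two columns.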


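3. The stochastic rank of a matrix. Let A be an n × n matrix with 
non-negative entries a_ij. If M is the maximum row or column sum in A, 
then, for every T ≥ M, and for every integer r ≥ n, there exists a matrix B 
which is an (r, T) d.s. extension of A. In fact, if 

\[ R_i = \sum_{j=1}^{n} a_{ij} \quad \text{and} \quad C_j = \sum_{i=1}^{n} a_{ij} \]

for i, j = 1, 2, ..., n, the matrix B = (b_ij) may be defined as follows: 

\[
\begin{aligned}
b_{ij} &= a_{ij} && \text{for } i \le n,\ j \le n, \\
b_{ij} &= a_{2n+1-j,\,2n+1-i} && \text{for } n+1 \le i \le 2n,\ n+1 \le j \le 2n, \\
b_{ij} &= 0 && \text{for } i \le n,\ n+1 \le j \le 2n \text{ provided } i + j \ne 2n+1, \\
       &= T - R_i && \text{for } i + j = 2n+1,\ i \le n, \\
b_{ij} &= 0 && \text{for } n+1 \le i \le 2n,\ j \le n \text{ provided } i + j \ne 2n+1, \\
       &= T - C_j && \text{for } i + j = 2n+1,\ j \le n, \\
b_{ij} &= 0 && \text{for } 2n+1 \le i \le n+r, \text{ or } 2n+1 \le j \le n+r, \text{ provided } i \ne j, \\
       &= T && \text{for } 2n+1 \le i = j \le n+r.
\end{aligned}
\]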
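
The construction for r ≥ n can be written out and checked directly. The sketch below uses indices starting at 0, so the condition i + j = 2n + 1 of the text becomes i + j = 2n − 1; the function name and the sample matrix are assumptions made for illustration.

```python
def ds_extension_large(a, T, r):
    """(r, T) d.s. extension of the n x n matrix a, for r >= n and T >= M."""
    n = len(a)
    R = [sum(row) for row in a]
    C = [sum(col) for col in zip(*a)]
    assert r >= n and T >= max(R + C)
    size = n + r
    b = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            b[i][j] = a[i][j]                          # A itself
            b[2 * n - 1 - j][2 * n - 1 - i] = a[i][j]  # reflected copy of A
        b[i][2 * n - 1 - i] = T - R[i]                 # fills row i up to T
        b[2 * n - 1 - i][i] = T - C[i]                 # fills column i up to T
    for k in range(2 * n, size):
        b[k][k] = T                                    # remaining diagonal
    return b


a = [[1, 2],
     [0, 3]]
b = ds_extension_large(a, T=5, r=3)
row_sums = [sum(row) for row in b]
col_sums = [sum(col) for col in zip(*b)]
```

Here M = 5, and every row and column of the 5 × 5 matrix `b` sums to T = 5.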


The question naturally arises, for what r ≤ n − 1 and T ≥ M is an (r, T) 
d.s. extension of A possible? In Theorem 2, we have a complete answer to 
this question. Its proof will make use of the following lemma. 


LEMMA 1. Let $B$ be an $m \times m$ doubly stochastic matrix with row and column
sums equal to $T$. Let $A$ be a $u \times v$ submatrix of $B$ and let $B$ be partitioned into
submatrices $A$, $A_1$, $A_2$, $A_3$ as in Figure 1. If $S$ is the sum of all the elements
in $A$ and $S_2$ is the sum of all the elements in $A_2$, then
\[
S - S_2 = (u + v - m)T.
\]


Proof. Let $S_1$ and $S_3$ be the sums of the elements in $A_1$ and $A_3$ respectively.
We have
\[
S + S_1 = uT, \qquad S + S_3 = vT, \qquad S + S_1 + S_2 + S_3 = mT,
\]
from which the result follows.
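The elimination behind Lemma 1 can be written out in one line: adding the row identity and the column identity and subtracting the identity for the whole matrix gives

```latex
\[
(S + S_1) + (S + S_3) - (S + S_1 + S_2 + S_3) = S - S_2 = uT + vT - mT = (u + v - m)T .
\]
```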
THEOREM 2. Let $A = (a_{ij})$ be an $n \times n$ matrix of non-negative real numbers
and let $M$ be the maximum row or column sum and $S$ the sum of all the entries.
If $r \le n - 1$, the necessary and sufficient condition that there exist a matrix $B$
which is an $(r, T)$ d.s. extension of $A$ is that $M \le T \le S/(n - r)$.


Proof. If $B$ is an $(r, T)$ d.s. extension of $A$, we apply Lemma 1 to $B$. We
have $S - S_2 = (n - r)T$. Since $0 \le S_2$, it follows that
\[
M \le T = \frac{S - S_2}{n - r} \le \frac{S}{n - r}.
\]
Clearly, $T = S/(n - r)$ if, and only if, $S_2 = 0$ and $T = M$ if and only if
$S_2 = S - (n - r)M$. To show the possibility of such extreme d.s. extensions
we construct the appropriate matrices. We first construct a matrix $C = (c_{ij})$
which is an $(r, S/(n - r))$ d.s. extension of $A$. Let








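\[
c_{ij} =
\begin{cases}
a_{ij} & \text{for } i \le n,\ j \le n,\\[2pt]
\dfrac{1}{r}\left(\dfrac{S}{n-r} - R_i\right) & \text{for } i \le n,\ n+1 \le j \le n+r,\\[6pt]
\dfrac{1}{r}\left(\dfrac{S}{n-r} - C_j\right) & \text{for } n+1 \le i \le n+r,\ j \le n,\\[6pt]
0 & \text{for } n+1 \le i \le n+r,\ n+1 \le j \le n+r.
\end{cases}
\]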


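We next construct a matrix $D = (d_{ij})$ which is an $(r, M)$ d.s. extension of
$A$. Let
\[
d_{ij} =
\begin{cases}
a_{ij} & \text{for } i \le n,\ j \le n,\\[2pt]
\dfrac{M - R_i}{r} & \text{for } i \le n,\ n+1 \le j \le n+r,\\[6pt]
\dfrac{M - C_j}{r} & \text{for } n+1 \le i \le n+r,\ j \le n,\\[6pt]
\dfrac{S - (n-r)M}{r^2} & \text{for } n+1 \le i \le n+r \text{ and } n+1 \le j \le n+r.
\end{cases}
\]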


For any $T$, $M \le T \le S/(n - r)$, let $p$ be the unique real number $0 \le p \le 1$
defined by $pS/(n - r) + (1 - p)M = T$. The matrix $B = pC + (1 - p)D$
is an $(r, T)$ d.s. extension of $A$.
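The sufficiency argument is constructive and may be followed numerically. The sketch below is illustrative only: the names `extreme` and `ds_extension` are assumptions, the entries of $A$ are taken to be integers so that exact arithmetic with `fractions.Fraction` can be used, and when $M = S/(n - r)$ (so that $p$ is not determined) the value $p = 1$ is chosen arbitrarily.

```python
from fractions import Fraction


def extreme(a, r, level, corner):
    """Border A with r rows and columns so that every line sums to `level`.

    The border entries are spread evenly; `corner` fills the r x r block A_2.
    """
    n = len(a)
    R = [sum(row) for row in a]
    C = [sum(col) for col in zip(*a)]
    rows = [list(map(Fraction, a[i])) + [(level - R[i]) / r] * r for i in range(n)]
    rows += [[(level - C[j]) / r for j in range(n)] + [corner] * r for _ in range(r)]
    return rows


def ds_extension(a, r, T):
    """Return B = pC + (1 - p)D as in the proof of Theorem 2."""
    n = len(a)
    line_sums = [sum(row) for row in a] + [sum(col) for col in zip(*a)]
    S, M = sum(map(sum, a)), max(line_sums)
    if not 1 <= r <= n - 1:
        raise ValueError("need 1 <= r <= n - 1")
    top = Fraction(S, n - r)
    if not M <= T <= top:
        raise ValueError("Theorem 2 requires M <= T <= S/(n - r)")
    C = extreme(a, r, top, Fraction(0))
    D = extreme(a, r, Fraction(M), Fraction(S - (n - r) * M, r * r))
    p = Fraction(1) if top == M else (Fraction(T) - M) / (top - M)
    return [[p * c + (1 - p) * d for c, d in zip(rc, rd)] for rc, rd in zip(C, D)]


B = ds_extension([[2, 1, 0], [1, 0, 0], [0, 0, 3]], r=1, T=Fraction(13, 4))
assert all(sum(row) == Fraction(13, 4) for row in B)
```

Here $S = 7$, $M = 3$ and $S/(n - r) = 7/2$, so $T = 13/4$ corresponds to $p = 1/2$.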

We now define stochastic rank. An $n \times n$ matrix $A$ with non-negative
entries has stochastic rank $\sigma$ if $A$ can be embedded in a d.s. matrix $B$ formed
from $A$ by the addition of $n - \sigma$ rows and columns and if $A$ cannot be embedded
in a d.s. matrix $B$ by the addition of fewer than $n - \sigma$ rows and
columns. By Theorem 2, the least $r$ for which $A$ can be embedded in an
$(n + r) \times (n + r)$ d.s. matrix $B$ is the minimum $r$ for which $M/S \le 1/(n - r)$.
This minimum $r$ is $n - [S/M]$. It follows that $\sigma = [S/M]$.
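Since $\sigma = [S/M]$, the stochastic rank is immediate to compute. A minimal sketch, assuming integer entries and a matrix that is not identically zero (the name `stochastic_rank` is illustrative):

```python
from fractions import Fraction
from math import floor


def stochastic_rank(a):
    """Integer part of S/M: entry sum over the largest row or column sum."""
    S = sum(map(sum, a))
    M = max([sum(row) for row in a] + [sum(col) for col in zip(*a)])
    return floor(Fraction(S, M))


a = [[2, 1, 0], [1, 0, 0], [0, 0, 3]]
sigma = stochastic_rank(a)        # S = 7, M = 3
least_r = len(a) - sigma          # fewest rows and columns that must be added
```

For this matrix the least admissible $r$ is 1, in agreement with Theorem 2: $M \le S/(n - r)$ holds for $r = 1$ (since $3 \le 7/2$) but fails for $r = 0$.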

An $n \times n$ sub-permutation matrix of rank $r$ is an $n \times n$ matrix consisting
of $r$ ones, no two of which are in the same row or column, and $n^2 - r$ zeros.
The convex hull of the sub-permutation matrices $P_k^{(r)}$ of rank $r$ consists of
all matrices $A$ expressible in the form $A = \sum_k \lambda_k P_k^{(r)}$ where $\sum_k \lambda_k = 1$ and
$\lambda_k \ge 0$ for all $k$. The convex polyhedral cone generated by the sub-permutation
matrices $P_k^{(r)}$ of rank $r$ consists of all matrices $A$ expressible in the form
$A = \sum_k \mu_k P_k^{(r)}$ where $\mu_k \ge 0$ for all $k$. In (2), the authors showed that the
necessary and sufficient condition that a matrix $A$ of non-negative entries
is in the convex hull of sub-permutation matrices of rank $n - r$ is that
$S = n - r$ and $M \le 1$. A simple restatement of this theorem is that the
necessary and sufficient condition that a matrix of non-negative entries is in
the convex polyhedral cone generated by the sub-permutation matrices of
rank $n - r$ is that $M/S \le 1/(n - r)$. Hence, the maximum rank $n - r$
satisfying this inequality is $[S/M]$ and this is equal to the stochastic rank $\sigma$
of $A$. Thus, we have the following corollary to Theorem 2.


COROLLARY. The stochastic rank of an $n \times n$ matrix $A$ of non-negative entries
is $\sigma$ if, and only if, $A$ is in the convex polyhedral cone of the $n \times n$ sub-permutation
matrices of rank $\sigma$ but is not in the convex polyhedral cone of the $n \times n$
sub-permutation matrices of rank $\sigma + 1$.


4. Vertices of a set of doubly stochastic extensions. If we consider
each $(r, T)$ d.s. extension of a matrix $A$ as a point in a space of dimension
$(n + r)^2$, it is apparent the set $a$ of all such matrices is convex. An extreme
or vertex matrix for the convex set $a$ is an $(r, T)$ d.s. extension of $A$ which
is not expressible in the form $pC + (1 - p)D$ in which $C$ and $D$ belong to
$a$, $C \ne D$ and $0 < p < 1$.

We may define the bipartite graph $K_A$ of an $n \times n$ matrix $A$ of non-negative
entries to be the graph in which the vertex sets are the set of indices
of the $n$ rows and $n$ columns and the edges are the places of the matrix in
which the entries are positive. A graph is disjoint if no two of its edges have
a vertex in common. A graph $K_1$ is a subgraph of $K_2$ if every edge of $K_1$ is
an edge of $K_2$.

A cycle in a bipartite graph $K$ is a finite subgraph $K'$ with the following
properties. Let $I$ and $J$ be the vertex sets. If $(i_1, j_1)$ is any edge of $K'$ then
there exists exactly one vertex $i_2 \in I$, $i_2 \ne i_1$, such that $(i_2, j_1)$ is an edge
of $K'$, and there exists exactly one vertex $j_2 \in J$, $j_2 \ne j_1$, such that $(i_2, j_2)$
is an edge of $K'$, and there exists exactly one vertex $i_3 \in I$, $i_3 \ne i_2$, such that
$(i_3, j_2)$ is an edge of $K'$, etc. If after $2k - 1$ such steps, $k \ge 2$, we find that
$(i_1, j_1), (i_2, j_1), (i_2, j_2), \ldots, (i_k, j_k), (i_1, j_k)$ are distinct and are exactly the
edges of $K'$, then $K'$ is a cycle. It follows that for a cycle $K'$ in the bipartite
graph of a matrix, there exists no row or column which contains exactly one
edge of the cycle.
In (1) the core of an R and C marking of an incidence matrix consists of 
the union of a number of cycles no two of which have an edge in common. 
In Theorem 3 we require the following lemma. 


LEMMA 2. For a bipartite graph K, a necessary and sufficient condition that 
there exist a subgraph of K which is a cycle is that there exist a finite subgraph 
of K in which no vertex of either vertex set is edge connected to exactly one vertex 
of the other vertex set. 


Proof. The necessity is immediate. 


To establish the sufficiency, we show that any finite subgraph $K'$ of $K$ in
which no vertex of either vertex set is edge connected to exactly one vertex
of the other, contains a subgraph which is a cycle of $K$. Let the vertex sets
of $K$ be $I$ and $J$. If $(i_1, j_1)$ is an edge of $K'$, $i_1 \in I$ and $j_1 \in J$, there exists
$i_2 \ne i_1$ such that $(i_2, j_1)$ is an edge of $K'$. Similarly, there exists $j_2 \ne j_1$ such
that $(i_2, j_2)$ is an edge of $K'$. Continuing this process, since $K'$ is a finite
graph, it follows that in the sequence $(i_1, j_1), (i_2, j_1), (i_2, j_2), \ldots$, there must
exist a first edge $E_1$ in which either the $i$ is identical with the $i$ of a previous
edge or the $j$ is identical with the $j$ of some previous edge. In either case, let
this previous edge be $E_0$. The sequence of edges beginning with $E_0$ and ending
with $E_1$ is a cycle.
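The walk used in the sufficiency proof translates directly into a procedure. The following sketch is an illustration, not the authors' formulation: the name `find_cycle` and the representation of a graph as a non-empty set of places `(i, j)` are assumptions. It walks from vertex to vertex, never returning along the edge just used, and stops at the first vertex that repeats.

```python
def find_cycle(edges):
    """Return the edges of a cycle in a finite bipartite graph given as a set of (row, col) places.

    Assumes every row and column that occurs in `edges` occurs at least twice.
    """
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(("row", i), []).append(("col", j))
        nbrs.setdefault(("col", j), []).append(("row", i))
    if any(len(v) < 2 for v in nbrs.values()):
        raise ValueError("some row or column carries exactly one edge")
    start = next(iter(nbrs))
    walk, position, prev = [start], {start: 0}, None
    while True:
        cur = walk[-1]
        nxt = next(v for v in nbrs[cur] if v != prev)  # leave by a different edge
        if nxt in position:                            # first repeated vertex closes a cycle
            loop = walk[position[nxt]:] + [nxt]
            break
        position[nxt] = len(walk)
        walk.append(nxt)
        prev = cur
    cycle = []
    for u, v in zip(loop, loop[1:]):
        row, col = (u, v) if u[0] == "row" else (v, u)
        cycle.append((row[1], col[1]))
    return cycle


places = {(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 1), (2, 2)}
cycle = find_cycle(places)
```

Which cycle is returned depends on where the walk starts; in every case each row and each column met by the returned edges contains exactly two of them.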

















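FIGURE 1. The $m \times m$ matrix $B$ partitioned into blocks: $A$ ($u$ rows, $v$ columns) in the upper left, $A_1$ in the upper right, $A_3$ in the lower left and $A_2$ in the lower right.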


Now, consider any $(r, T)$ d.s. extension $B$ of a matrix $A$ of non-negative
elements and let $B$ be partitioned into submatrices $A$, $A_1$, $A_2$, and $A_3$ as in
Figure 1. Let $K_{A_1}$, $K_{A_2}$, $K_{A_3}$
be the bipartite graphs of $A_1$, $A_2$, and $A_3$ and let $L_B$ be the union of
$K_{A_1}$, $K_{A_2}$, $K_{A_3}$,
so that $K_B$ is the union of $K_A$ and $L_B$. We are now in a position to state the
main theorem of this section.


THEOREM 3. Let $a$ be the convex set of all $(r, T)$ d.s. extensions of a matrix $A$.
A necessary and sufficient condition that a matrix $B \in a$ be a vertex matrix of
the convex set $a$ is that no subgraph of $L_B$ is a cycle.


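Proof. If a subgraph $L_B'$ of $L_B$ is a cycle, let the edges of the cycle be
$(i_1, j_1), (i_2, j_1), (i_2, j_2), \ldots, (i_k, j_k), (i_1, j_k)$.

Let $\epsilon = \tfrac{1}{2} \min b_{ij}$ taken over all edges $(i, j)$ of $L_B'$. Now, if $C = (c_{ij})$ is
defined by
\[
c_{ij} =
\begin{cases}
b_{ij} & \text{if } (i,j) \text{ is not an edge of } L_B',\\
b_{ij} + \epsilon & \text{if } (i,j) \text{ is } (i_1, j_1), (i_2, j_2), \ldots, \text{ or } (i_k, j_k),\\
b_{ij} - \epsilon & \text{if } (i,j) \text{ is } (i_2, j_1), (i_3, j_2), \ldots, \text{ or } (i_1, j_k),
\end{cases}
\]
and if $D = (d_{ij})$ is defined by
\[
d_{ij} =
\begin{cases}
b_{ij} & \text{if } (i,j) \text{ is not an edge of } L_B',\\
b_{ij} - \epsilon & \text{if } (i,j) \text{ is } (i_1, j_1), (i_2, j_2), \ldots, \text{ or } (i_k, j_k),\\
b_{ij} + \epsilon & \text{if } (i,j) \text{ is } (i_2, j_1), (i_3, j_2), \ldots, \text{ or } (i_1, j_k),
\end{cases}
\]
clearly $C$ and $D$ belong to $a$. Since $B = \tfrac{1}{2}C + \tfrac{1}{2}D$, $B$ is not a vertex matrix
of the set $a$.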
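The perturbation in this half of the proof may be tried on a small case. In the sketch below (the name `split_along_cycle` is an assumption, and the entries of $B$ are assumed to be integers or `Fraction` values so that the arithmetic is exact) the cycle is supplied in the order $(i_1, j_1), (i_2, j_1), (i_2, j_2), \ldots, (i_1, j_k)$, so that edges in even positions of the list are raised in $C$ and those in odd positions are lowered.

```python
from fractions import Fraction


def split_along_cycle(b, cycle):
    """Return (C, D) with B = (C + D)/2, obtained by shifting epsilon around the cycle."""
    eps = Fraction(min(b[i][j] for i, j in cycle), 2)   # half the least entry on the cycle
    c = [list(map(Fraction, row)) for row in b]
    d = [list(map(Fraction, row)) for row in b]
    for pos, (i, j) in enumerate(cycle):
        sign = 1 if pos % 2 == 0 else -1
        c[i][j] += sign * eps
        d[i][j] -= sign * eps
    return c, d


# B is a (2, 3) d.s. extension of the 1 x 1 matrix A = (1); the cycle avoids the place of A.
B = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
C, D = split_along_cycle(B, [(1, 1), (2, 1), (2, 2), (1, 2)])
assert C != D and all(sum(row) == 3 for row in C + D)
```

Each row and column of the cycle receives one $+\epsilon$ and one $-\epsilon$, which is why the line sums of $C$ and $D$ are unchanged.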

We now show that if $B$ and $C$ are $(r, T)$ d.s. extensions of $A$ such that
$B \ne C$ and $K_B = K_C$ then $L_B$ ($= L_C$) has a subgraph $L_B'$ which is a cycle.
Indeed, let $L_B^*$ be the subgraph consisting of the edges $(i, j)$ at which $0 < b_{ij}$,
$0 < c_{ij}$, and $c_{ij} \ne b_{ij}$. Since $b_{ij} = c_{ij}$ for all $(i, j)$ in $K_A$, $L_B^*$ is a subgraph
of $L_B$ and since $B \ne C$, $L_B^*$ has at least one edge. Since the matrices $B$ and
$C$ are doubly stochastic with row and column sums equal to $T$, they cannot
differ at exactly one place in a row or column. By Lemma 2, $L_B$ (in fact
$L_B^*$) contains a subgraph $L_B'$ which is a cycle.

Next, suppose that $B \in a$ is not a vertex, so that $B$ is expressible in the
form $B = pC + (1 - p)D$ where $0 < p < 1$, $C \ne D$ and $C$ and $D$ belong
to $a$. Now $L_C$ and $L_D$ are subgraphs of $L_B$, but we cannot say $L_C = L_D = L_B$,
for we might have $c_{ij} = 0$, $b_{ij} \ne 0$, and $d_{ij} \ne 0$. However, if $q \ne p$, $0 < q < 1$,
then $E = qC + (1 - q)D$ belongs to $a$, $B \ne E$ and $K_B = K_E$. Hence $L_B$
contains a subgraph $L_B'$ which is a cycle. This completes the proof of
Theorem 3.


COROLLARY. Let $a$ be the convex set of all $(r, T)$ d.s. extensions of a matrix $A$.
A necessary and sufficient condition that a matrix $B \in a$ be a vertex matrix of the
convex set $a$ is that there exist no matrix $C \in a$ such that $B \ne C$ and $K_B = K_C$.










LEMMA 3. If $P$ is an $r \times r$ matrix of non-negative elements with at least two
non-zero elements in every row then the bipartite graph $K_P$ contains a subgraph
$K_P'$ which is a cycle.


Proof. Delete from $P$ all the columns containing no non-zero elements and
let the deleted $r \times s$ matrix ($s \le r$) be $Q$. If there are at least two non-zero
elements in every column of $Q$ then the required cycle exists by Lemma 2. If
a column contains exactly one non-zero element $q_{ij}$, delete the $i$th row and
$j$th column of $Q$ and denote the deleted matrix by $Q_1$. Continue this process.
If we find $Q_t$ such that every column of $Q_t$ contains 2 non-zero elements, the
cycle exists by Lemma 2. If no such $Q_t$ exists for $t = 1, 2, \ldots, s - 3$, then
$Q_{s-2}$ is an $(r - s + 2) \times 2$ matrix with two columns and with two non-zero
elements in every row. Since $r - s + 2 \ge 2$ the graph of $Q_{s-2}$ contains a
cycle.

Let $B$ be an $(r, T)$ d.s. extension of $A$. Let the rows and columns of $B$ be
rearranged as in Figure 1. If in the $i$th row of $B$ ($i = 1, 2, \ldots, n$) there is at
most one $j > n$ such that the element $b_{ij} > 0$ then the $i$th row of $A$ is simply
extended. Similarly, if in the $j$th column of $B$ there is at most one $i > n$ such
that $b_{ij} > 0$, then the $j$th column of $A$ is simply extended.


THEOREM 4. If B is a vertex of the convex set a of all (r, T) d.s. extensions of 
A, then at least n — r + 1 of the rows and at least n — r + 1 of the columns 
of A are simply extended. 


Proof. Suppose that $r$ rows of $B$ are not simply extended. In each of these
rows we have at least two elements
\[
b_{ij_1} > 0 \text{ and } b_{ij_2} > 0, \quad j_1 > n,\ j_2 > n,\ j_1 \ne j_2.
\]
Thus the $n \times r$ matrix $A_1$ (see Figure 1) contains an $r \times r$ submatrix in
every row of which there are two non-zero elements.
Hence, by Lemma 3, the graph $K_{A_1}$
contains a subgraph which is a cycle and, by Theorem 3, $B$ is not a vertex of $a$.
The proof when $r$ columns of $A$ are not simply extended is similar.


5. The connection between term rank and stochastic rank. Since $\rho$
is greater than or equal to the least integer which is greater than or equal
to $S/M$ and since $\sigma = [S/M]$, we have the following result. If $S/M$ is an
integer, $\rho \ge \sigma$, and if $S/M$ is not an integer, $\rho \ge \sigma + 1$.

For an $n \times n$ doubly stochastic matrix, $\rho = \sigma = n$, and for a sub-permutation
matrix of rank $r$, $\rho = \sigma = r$. However, there are $n \times n$ matrices for
which $\rho - \sigma = n - 1$. In fact, the matrix $A = (a_{ij})$ in which $a_{11} = n$,
$a_{22} = a_{33} = \ldots = a_{nn} = 1$, $a_{ij} = 0$ for $i \ne j$ is such a matrix. We have
$S/M = 2 - 1/n$. Thus $\sigma = 1$ and $\rho = n$.
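The arithmetic for this example may be set out as follows, for $n \ge 2$; the $n$ diagonal places form a disjoint set of positive entries, which gives the term rank.

```latex
\[
S = n + (n - 1) = 2n - 1, \qquad M = n, \qquad
1 < \frac{S}{M} = 2 - \frac{1}{n} < 2,
\]
\[
\sigma = \left[ \frac{S}{M} \right] = 1, \qquad \rho = n, \qquad \rho - \sigma = n - 1 .
\]
```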







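For a matrix of zeros and ones, Ryser (4; 5) has considered the transformation
which replaces a minor
\[
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad\text{by}\quad
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]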


The effect of this transformation is that the term rank varies between limits 
which Ryser finds. It is interesting to note that the stochastic rank of a 
matrix of zeros and ones is invariant under Ryser’s transformation. 

If $M \le S/(n - r)$, then, since $\rho \ge \sigma \ge n - r$, there exist integers $t$ such
that $\rho \ge t \ge n - r$. We have the following theorem.


THEOREM 5. Let $A$ be an $n \times n$ matrix of non-negative elements. If
$M < S/(n - r)$ and $M < T < S/(n - r)$ and if $K_A'$ is any disjoint subgraph
of $K_A$ consisting of $t$ edges ($\rho \ge t \ge n - r$) then there exists an $(r, T)$ d.s.
extension $B$ of $A$ with the property that the graph $K_B$ contains a disjoint
subgraph $K_B'$ consisting of $n + r$ edges such that the edges common to $K_B'$
and $K_A$ are exactly those of $K_A'$.


Proof. If we select any $p$ such that $0 < p < 1$, then since $M < S/(n - r)$,
the matrix $B = pC + (1 - p)D$ of Theorem 2 is an $(r, T)$ d.s. extension of
$A$ in which every element of $A_1$, $A_2$, and $A_3$ (Figure 1) is positive. Thus all
the places of $A_1$, $A_2$, and $A_3$ are edges of $K_B$. Rearrange the rows and columns
of $B$ so that $K_A'$ consists of the places $(1, 1), (2, 2), (3, 3), \ldots, (t, t)$. Now consider
the disjoint graph $L$ which has as its edges the places $(i, j)$ of $B$ defined
by $i + j = n + t + r + 1$. Since $t \ge n - r$, we have $i + j \ge 2n + 1$ and
hence every edge in $L$ is a place in $A_1$, $A_2$, or $A_3$ and $L$ is a subgraph of $K_B$.
The number of edges in $L$ is $n + r - t$. For an edge $(i, j)$ of $L$ we cannot
have $i \le t$, for this would imply $j \ge n + r + 1$, and similarly we cannot
have $j \le t$. Thus the edges of $L$ and $K_A'$ have no vertices in common. Clearly,
the graph $K_B'$ defined as the union of $L$ and $K_A'$ is the required disjoint
subgraph of $K_B$.

Let $K$ be a bipartite graph whose edges are a set of places in an $n \times n$
array and let $A$ be a matrix formed by putting positive entries in the places
of $K$ and zeros elsewhere. For a given graph $K$, the term rank $\rho$ of all such
matrices $A$ is the same and is equal to the exterior dimension (3) of the
graph. Thus, term rank is really a graphical concept. On the other hand, for
a given graph $K$, the stochastic rank $\sigma$ of such matrices $A$ will vary between
1 and an attainable maximum which we denote by $\sigma_K$. We now show that
$\sigma_K \le \rho \le \sigma_K + 1$. The inequality on the left is a consequence of Theorems
1 and 2. To establish the inequality on the right, consider the matrix $A$ formed
by placing 1 in each of the $\rho$ places of a maximal disjoint subgraph of $K$ and
$\epsilon$ in the other places of $K$. If $a$ is the maximum number of places of $K$ in any
row or column of the $n \times n$ array and if $b$ is the number of places in $K$, then


\[
\frac{S}{M} \ge \frac{\rho + (b - \rho)\epsilon}{1 + (a - 1)\epsilon}.
\]


If $a = 1$, then $b = \rho$ and $\sigma_K = \rho$. In other cases, $\epsilon$ can be chosen small enough
that $\sigma = [S/M] \ge \rho - 1$. Hence, $\sigma_K \ge \sigma \ge \rho - 1$ or $\rho \le \sigma_K + 1$. The
inequality $\sigma_K \le \rho \le \sigma_K + 1$ is best possible in the sense that there exist
graphs $K$ for which $\sigma_K = \rho$ and others for which $\rho = \sigma_K + 1$. The graph $K$
consisting of the places on a main diagonal in an $n \times n$ array is a graph in
which $\sigma_K = \rho$. The graph $K$ consisting of 3 of the 4 places in a $2 \times 2$ array
is such that $\rho = 2$. But any matrix $A$ with non-zero elements in the places
of $K$ and a zero in the fourth place of the array lies in the convex polyhedral
cone of sub-permutation matrices of rank 1 and does not lie in the convex
polyhedral cone of sub-permutation matrices of rank 2. Hence $\sigma_K = 1$. Consider
a graph $K$ for which the maximum $\sigma_K$ is attained in a matrix $A$ in which
$S/M$ is non-integral. We have $\rho \le \sigma_K + 1$ and $\rho \ge \sigma_K + 1$ so that $\rho = \sigma_K + 1$.
The result just proved may be reformulated as the following theorem.


THEOREM 6. Let $K$ be a bipartite graph whose edges are the places in an $n \times n$
array. Let $a$ be the set of all matrices $A$ with positive entries in the places of $K$
and zeros elsewhere. Let $\sigma_K$ be the maximum stochastic rank attainable by a matrix
of the set $a$. Then every matrix $A$ of $a$ has the same term rank $\rho$. Furthermore,
if $S_A$ and $M_A$ represent the entry sum and maximal row or column sum of $A$
respectively then
\[
\rho = \sup_{A \in a} \left( \frac{S_A}{M_A} \right).
\]
Also if this supremum is attained by some matrix $A$, then $\rho = \sigma_K$, otherwise
$\rho = \sigma_K + 1$.


6. Linear programming formulation. Some of the theorems concerning
$(r, T)$ d.s. extensions of an $n \times n$ matrix $A$ may be reformulated as
problems in the language of linear programming. In these reformulations
the restrictions on $A$ to non-negative entries may be relaxed somewhat. The
only requirement is that $A$ satisfy the condition $S \ge (n - r)M \ge 0$. Two
such formulations follow.


PROBLEM 1. Let $A$ be an $n \times n$ matrix having $S \ge (n - r)M \ge 0$, and let
$T$ be any number. Find a set of numbers $x_{ij}$ ($i = 1, 2, \ldots, n + r$; $j = 1, 2, \ldots, n + r$;
at least one of $i$ and $j$ is greater than $n$), subject to the following conditions.

(1) $x_{ij} \ge 0$ for all $i, j$.

(2) $\sum_{j=1}^{n} a_{ij} + \sum_{j=n+1}^{n+r} x_{ij} = T$ for $i = 1, 2, \ldots, n$.

(3) $\sum_{j=1}^{n+r} x_{ij} = T$ for $i = n + 1, n + 2, \ldots, n + r$.

(4) $\sum_{i=1}^{n} a_{ij} + \sum_{i=n+1}^{n+r} x_{ij} = T$ for $j = 1, 2, \ldots, n$.

(5) $\sum_{i=1}^{n+r} x_{ij} = T$ for $j = n + 1, n + 2, \ldots, n + r$.





Theorem 2 states that the inequalities have solutions if and only if
$M \le T \le S/(n - r)$ and exhibits some of these solutions. If now each set
of values of $x_{ij}$ satisfying (1), (2), ..., (5) is considered as a point in a space
of $(n + r)^2 - n^2$ dimensions, the set of all such points is convex and Theorem 3
gives a graphical characterization of the vertices of this set.


PROBLEM 2. Let $A$ be an $n \times n$ matrix having $S \ge (n - r)M \ge 0$. Find a
set of numbers $x_{ij}$ ($i = 1, 2, \ldots, n + r$; $j = 1, 2, \ldots, n + r$; at least one
of $i$ and $j$ is greater than $n$), subject to the following conditions:

(1) $x_{ij} \ge 0$ for all $i, j$.

(2) $\sum_{j=1}^{n} a_{ij} + \sum_{j=n+1}^{n+r} x_{ij} = \sum_{j=1}^{n+r} x_{n+r,j}$ for $i = 1, 2, \ldots, n$.

(3) $\sum_{j=1}^{n+r} x_{ij} = \sum_{j=1}^{n+r} x_{n+r,j}$ for $i = n + 1, n + 2, \ldots, n + r - 1$.

(4) $\sum_{i=1}^{n} a_{ij} + \sum_{i=n+1}^{n+r} x_{ij} = \sum_{i=1}^{n+r} x_{i,n+r}$ for $j = 1, 2, \ldots, n$.

(5) $\sum_{i=1}^{n+r} x_{ij} = \sum_{i=1}^{n+r} x_{i,n+r}$ for $j = n + 1, n + 2, \ldots, n + r - 1$.

The sum
\[
\sum_{j=1}^{n+r} x_{n+r,j}
\]
is to be maximized or minimized.

In this formulation our theorems state that feasible solutions always exist
for both the maximum and minimum problems. They also exhibit solutions at
which the maximum and minimum are attained and state that the maximum
value is $S/(n - r)$ and the minimum value is $M$. Our graphical theorems
characterize the sets of all maximal and minimal solutions.


REFERENCES 


1. A. L. Dulmage and N. S. Mendelsohn, Some generalizations of the problem of distinct representatives, Can. J. Math., 10 (1958), 230-41.
2. A. L. Dulmage and N. S. Mendelsohn, The convex hull of sub-permutation matrices, Proc. Amer. Math. Soc., 9 (1958), 253-4.
3. A. L. Dulmage and N. S. Mendelsohn, Coverings of bipartite graphs, Can. J. Math., 10 (1958), 517-34.
4. H. J. Ryser, Combinatorial properties of matrices of zeros and ones, Can. J. Math., 9 (1957), 371-7.
5. H. J. Ryser, The term rank of a matrix, Can. J. Math., 10 (1957), 57-65.





University of Manitoba 











DISJOINT TRANSVERSALS OF SUBSETS 
P. J. HIGGINS 


1. Introduction. Let $A_1, A_2, \ldots, A_n$ be a finite collection of subsets (not
necessarily distinct) of a set $A$. By a transversal¹ of $A_1, A_2, \ldots, A_n$ we shall
mean a set of $n$ distinct elements $a_1, a_2, \ldots, a_n$ of $A$ such that, for some
permutation $i_1, i_2, \ldots, i_n$ of the integers $1, 2, \ldots, n$,
\[
a_j \in A_{i_j} \qquad (j = 1, 2, \ldots, n).
\]
More generally, we shall say that the set $\{a_1, a_2, \ldots, a_r\}$ ($r \le n$) is a partial
transversal of $A_1, A_2, \ldots, A_n$ of length $r$ if (i) $a_1, a_2, \ldots, a_r$ are distinct elements
of $A$ and (ii) there exists a set of distinct integers $i_1, i_2, \ldots, i_r$ such
that
\[
a_j \in A_{i_j} \qquad (j = 1, 2, \ldots, r).
\]
A well-known theorem of P. Hall (2) states that the sets $A_1, A_2, \ldots, A_n$
have a transversal (of length $n$) if, and only if, every $k$ of them contain
collectively at least $k$ distinct elements ($k = 1, 2, \ldots, n$). A generalization
of this theorem by Ore (3) states that the sets $A_1, A_2, \ldots, A_n$ have a partial
transversal of length $r \le n$ if, and only if, every $k$ of them contain collectively
at least $k + r - n$ distinct elements ($n - r + 1 \le k \le n$).

In this paper we enquire under what conditions the sets $A_1, A_2, \ldots, A_n$
will have $m$ mutually disjoint partial transversals of prescribed lengths
$r_1, r_2, \ldots, r_m$. As in the two theorems quoted above, the obvious necessary
conditions are again found to be sufficient. As a special case we deduce a
theorem of Ryser (4) and Gale (1) concerning the existence of matrices of
0's and 1's with prescribed row sums and column sums.


2. Notation. Throughout our argument $n$ will denote a fixed positive
integer (the number of subsets $A_i$), and $r_1, r_2, \ldots, r_m$ will denote positive
integers not exceeding $n$. We shall suppose that $r_1 \ge r_2 \ge \ldots \ge r_m > 0$ and
think of these integers as a partition $[r_i]$ of $r_1 + r_2 + \ldots + r_m$. The conjugate
partition $[r_j^*]$ is defined as usual:
\[
(1) \qquad r_j^* = \sum_{r_i \ge j} 1 \qquad (j = 1, 2, \ldots, r_1).
\]
It is convenient also to define $r_j^* = 0$ if $r_1 < j \le n$, which is in accord with
(1) if we interpret empty sums as zero.


Received May 22, 1958. 


¹This term, due to P. Hall, is normally used when the sets $A_1, A_2, \ldots, A_n$ are disjoint, but
its use in this wider sense is convenient here.




We now write
\[
(2) \qquad a_k = \sum_{j=n-k+1}^{n} r_j^* \qquad (k = 1, 2, \ldots, n).
\]
An alternative expression for $a_k$ can be obtained as follows. If $s$ and $t$ are
integers, let $\mathrm{Ex}(s, t)$ denote the excess, if any, of $s$ over $t$, that is, $\mathrm{Ex}(s, t)
= s - t$ if $s \ge t$, and $\mathrm{Ex}(s, t) = 0$ if $s < t$. Then
\[
(3) \qquad a_k = \sum_{i=1}^{m} \mathrm{Ex}(r_i, n - k) \qquad (k = 1, 2, \ldots, n).
\]
The easiest way to see this is to draw a partition diagram for $[r_i]$, that is, an
$m \times n$ matrix whose $i$th row has entries 1 in the first $r_i$ places and 0 elsewhere.
Then $r_j^*$ is the number of 1's in the $j$th column, and $a_k$ is the number
of 1's in the last $k$ columns. However, the $i$th row has exactly $\mathrm{Ex}(r_i, n - k)$
1's in the last $k$ columns, and (3) follows.
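The agreement of (2) and (3) is easily confirmed for a particular partition. The following sketch is illustrative only; the function names are assumptions and columns are counted from 1 as in the text.

```python
def conjugate(r, n):
    """r_j^* for j = 1, ..., n: the number of parts r_i with r_i >= j."""
    return [sum(1 for ri in r if ri >= j) for j in range(1, n + 1)]


def ex(s, t):
    """Excess of s over t, or zero if s < t."""
    return s - t if s >= t else 0


def a_from_conjugate(r, n, k):
    """Formula (2): the sum of the last k terms of the conjugate partition."""
    return sum(conjugate(r, n)[n - k:])


def a_from_excess(r, n, k):
    """Formula (3): the sum of Ex(r_i, n - k) over the parts r_i."""
    return sum(ex(ri, n - k) for ri in r)


r, n = [4, 3, 1], 4
values = [a_from_conjugate(r, n, k) for k in range(1, n + 1)]
assert values == [a_from_excess(r, n, k) for k in range(1, n + 1)]
```

For the partition $[4, 3, 1]$ with $n = 4$ the conjugate partition is $[3, 2, 2, 1]$, and both formulas give $a_1, \ldots, a_4 = 1, 3, 5, 8$.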


3. Disjoint partial transversals. Suppose that $A_1, A_2, \ldots, A_n$ have disjoint
partial transversals (D.P.T.'s) $R_1, R_2, \ldots, R_m$ of lengths $r_1, r_2, \ldots, r_m$
respectively. The elements of $R_i$ represent $r_i$ of the $A$'s. Of these $A$'s at least
$\mathrm{Ex}(r_i, n - k)$ must be included in any collection of $k$ of the $A$'s. It follows
that every $k$ of the $A$'s must contain between them at least $\mathrm{Ex}(r_i, n - k)$
distinct elements out of $R_i$ and hence at least
\[
a_k = \sum_{i=1}^{m} \mathrm{Ex}(r_i, n - k)
\]
distinct elements altogether, since the $R$'s are disjoint. Our theorem asserts
that this necessary condition is also sufficient.


THEOREM. A necessary and sufficient condition for $A_1, A_2, \ldots, A_n$ to have
mutually disjoint partial transversals of lengths $r_1, r_2, \ldots, r_m$ is that, for $k = 1,
2, \ldots, n$, every $k$ of the $A$'s contain between them at least $a_k$ distinct elements,
where $a_k$ is defined by (1) and (2) above.


We observe here that the case $m = 1$, $r_1 = r$, is precisely Ore's theorem since
we then have $a_k = 0$ ($1 \le k \le n - r$) and $a_k = k + r - n$ ($n - r + 1 \le k
\le n$).

The proof of sufficiency proceeds by induction on $n$. It is trivial when
$n = 1$, and from now on we shall assume the result for all collections of
$n' < n$ sets and all sets of integers $r_i \le n'$.

We distinguish two cases which are mutually exclusive and cover all
possibilities:

Case 1. $m \ge 2$ and $r_m \le n$, $r_{m-1} < n$;

Case 2. $r_1 = r_2 = \ldots = r_{m-1} = n$, $1 \le r_m \le n$.

We shall first reduce Case 1 to Case 2.










If $[r_i]$ is a partition falling under Case 1, we define a new partition $[\bar r_i]$,
the reduction of $[r_i]$, as follows. Let $r_1 = r_2 = \ldots = r_t = n$, $r_{t+1} < n$, where,
by assumption, $0 \le t \le m - 2$. Then $\bar r_m = r_m - 1$, $\bar r_{t+1} = r_{t+1} + 1$, and
$\bar r_i = r_i$ for all other values of $i$. Clearly $\bar r_1 \ge \bar r_2 \ge \ldots \ge \bar r_m$, and by a finite
number of such reductions any partition falling under Case 1 can be reduced
to one falling under Case 2. Note that we may have $\bar r_m = 0$, in which case
the value of $m$ is reduced by 1. It will, however, be convenient at times to
retain such vanishing parts of a partition and interpret a partial transversal
of length zero as the empty set. This will not affect the proof in any way.

To reduce Case 1 to Case 2 it is enough to prove 


LEMMA. If the theorem is true for the partition $[\bar r_i]$ then it is also true for the
partition $[r_i]$.


Proof. Suppose that $[r_i]$ falls under Case 1, and every $k$ of the $A$'s contain
between them at least $a_k$ distinct elements ($k = 1, 2, \ldots, n$).

Case 1 (a). First consider the possibility that for some $k$ ($1 \le k \le n - 1$)
there is a collection of $k$ of the $A$'s, say $A_1, A_2, \ldots, A_k$, which contain between
them precisely $a_k$ distinct elements. We construct two new partitions $[p_i]$
and $[q_i]$ where $p_i = \mathrm{Ex}(r_i, n - k)$, $q_i = \min(r_i, n - k)$ ($i = 1, 2, \ldots, m$).
Then $p_i + q_i = r_i$ ($i = 1, 2, \ldots, m$), $p_j^* = r_{j+n-k}^*$ ($j = 1, 2, \ldots, k$), and
$q_j^* = r_j^*$ ($j = 1, 2, \ldots, n - k$). We apply our induction hypothesis to the
sets $A_1, A_2, \ldots, A_k$ with the partition $[p_i]$ and to the sets $A_{k+1}, A_{k+2}, \ldots, A_n$
with the partition $[q_i]$. For this purpose let $\beta_s$ and $\gamma_s$ be the integers obtained
from $[p_i]$ and $[q_i]$ in the same way that the $a_s$ were obtained from $[r_i]$. Thus
\[
\beta_s = \sum_{j=k-s+1}^{k} p_j^* = \sum_{j=n-s+1}^{n} r_j^* = a_s \qquad (s = 1, 2, \ldots, k),
\]
and
\[
\gamma_s = \sum_{j=n-k-s+1}^{n-k} q_j^* = \sum_{j=n-k-s+1}^{n-k} r_j^* = a_{k+s} - a_k \qquad (s = 1, 2, \ldots, n - k).
\]
By assumption, every $s$ of the sets $A_1, A_2, \ldots, A_k$ contain between them at
least $a_s = \beta_s$ distinct elements. Also, any $s$ of the sets $A_{k+1}, A_{k+2}, \ldots, A_n$
contain, together with all of $A_1, A_2, \ldots, A_k$, at least $a_{k+s}$ distinct elements.
Since $A_1 \cup A_2 \cup \ldots \cup A_k$ contains precisely $a_k$ elements, any $s$ of the sets
$A_{k+1}, A_{k+2}, \ldots, A_n$ must contain between them at least $a_{k+s} - a_k = \gamma_s$
distinct elements not in $A_1 \cup A_2 \cup \ldots \cup A_k$. It follows that there exist
D.P.T.'s $P_1, P_2, \ldots, P_m$ of $A_1, A_2, \ldots, A_k$ of lengths $p_1, p_2, \ldots, p_m$, and
D.P.T.'s $Q_1, Q_2, \ldots, Q_m$ of $A_{k+1}, A_{k+2}, \ldots, A_n$ of lengths $q_1, q_2, \ldots, q_m$, none
of the $Q$'s having any elements in common with any of the $P$'s. The sets
$P_1 \cup Q_1, P_2 \cup Q_2, \ldots, P_m \cup Q_m$ are then D.P.T.'s of $A_1, A_2, \ldots, A_n$ of
lengths $r_1, r_2, \ldots, r_m$.


Case 1 (b). If no such collection of $A$'s exists then, for $k = 1, 2, \ldots, n - 1$,
every $k$ of the $A$'s must contain between them at least $a_k + 1$ distinct elements,
and we now appeal to the reduced partition. We observe that in passing
from $[r_i]$ to $[\bar r_i]$ one of the $r_j^*$ is increased by 1, and one of them is decreased
by 1, the others being unaltered. Hence, in the obvious notation,
\[
\bar a_k = \sum_{j=n-k+1}^{n} \bar r_j^* \le 1 + \sum_{j=n-k+1}^{n} r_j^* = 1 + a_k \qquad (k = 1, 2, \ldots, n),
\]
while $\bar a_n = a_n$. Thus every $k$ of the $A$'s contain between them at least $\bar a_k$
distinct elements ($k = 1, 2, \ldots, n$). Assuming the theorem for the partition
$[\bar r_i]$, we can find D.P.T.'s $\bar R_1, \bar R_2, \ldots, \bar R_m$ of lengths $\bar r_1, \bar r_2, \ldots, \bar r_m$. Now
$\bar r_{t+1} > r_{t+1} \ge r_m > \bar r_m$ ($t$ has the same meaning as before). Hence there must
be in $\bar R_{t+1}$ at least one element which represents a set $A_i$ not represented by
any element of $\bar R_m$. If we transfer this element from $\bar R_{t+1}$ to $\bar R_m$, we obtain
D.P.T.'s of lengths $r_1, r_2, \ldots, r_m$. This proves the lemma.

It remains to prove the theorem in Case 2, that is, under the assumptions
$r_1 = r_2 = \ldots = r_{m-1} = n$, $1 \le r_m \le n$, $m \ge 1$. Then $r_j^* = m$ for $j = 1,
2, \ldots, r$, and $r_j^* = m - 1$ for $j = r + 1, r + 2, \ldots, n$, where for convenience
we write $r_m = r$. We now make the further definition
\[
\delta_k = \sum_{j=1}^{k} r_j^* \ge a_k \qquad (k = 1, 2, \ldots, n).
\]
Assume that $A_1, A_2, \ldots, A_n$ satisfy the conditions of the theorem.


Case 2 (a). First suppose that, for $k = 1, 2, \ldots, n - 1$, every $k$ of the $A$'s
contain between them at least $\delta_k$ distinct elements. The same will be true
for $k = n$ since $\delta_n = a_n$. Consider a collection of sets $\{B_i\}$ consisting of $m$
repetitions of each of the sets $A_1, A_2, \ldots, A_r$ and $m - 1$ repetitions of each
of the sets $A_{r+1}, A_{r+2}, \ldots, A_n$ (if any). There are $a_n$ sets altogether, and we
shall show that, for $s = 1, 2, \ldots, a_n$, any $s$ of these sets contain between them
at least $s$ distinct elements. We must first count the number $k$ of distinct*
$A$'s included amongst $s$ of the $B$'s. Clearly $k \ge s/m$; and if $s > mr$ then
$k \ge r + (s - mr)/(m - 1)$. If $s \le mr$, then $s/m \le r$ and, if $k'$ is the smallest
integer such that $k' \ge s/m$, then $k' \le r$. Hence $\delta_{k'} = k'm \ge s$, and any $s$ of
the $B$'s must contain between them at least $\delta_k \ge \delta_{k'} \ge s$ distinct elements. On
the other hand, if $s > mr$, then $k - r \ge (s - mr)/(m - 1)$, and $\delta_k = rm
+ (k - r)(m - 1) \ge rm + (s - mr) = s$. Thus again any $s$ of the $B$'s
must contain between them at least $s$ distinct elements. Applying Hall's
theorem quoted in the introduction, we can find a complete transversal of
the $B$'s. The $a_n$ distinct elements in this transversal comprise $m$ distinct
representatives of each of the sets $A_1, A_2, \ldots, A_r$ and $m - 1$ distinct representatives
of each of the sets $A_{r+1}, A_{r+2}, \ldots, A_n$. It is easy to see that these
elements can be arranged to form D.P.T.'s of $A_1, A_2, \ldots, A_n$, $m - 1$ of
length $n$ and one of length $r$.


*By "distinct" we mean here "having distinct suffixes." Thus distinct $A$'s may have the
same members.










Case 2 (b). The alternative to 2 (a) is that for some $k$ ($1 \le k \le n - 1$)
there is a collection of $k$ of the $A$'s whose union contains fewer than $\delta_k$ distinct
elements (but at least $a_k$). From all such collections (for all possible values
of $k$) we pick one collection consisting of, say, $k$ $A$'s whose union contains
$a_k + u$ distinct elements with $u$ as small as possible. Thus every $s$ of the $A$'s
($s = 1, 2, \ldots, n$) contain between them at least $\min(\delta_s, a_s + u)$ distinct
elements. (This statement for $s = n$ follows from the fact that $\delta_n = a_n$.) Let
the chosen sets be $A_1, A_2, \ldots, A_k$ ($k$ is now fixed, $1 \le k \le n - 1$). If $u = 0$
we may, of course, proceed as in Case 1 (a). This fails, however, if $u > 0$,
and we must appeal again to the special form of the partition $[r_i]$.

Consider the sums of $k$ successive $r^*$'s, that is, the integers $\epsilon_t = r_{t+1}^* +
r_{t+2}^* + \ldots + r_{t+k}^*$ ($t = 0, 1, \ldots, n - k$). Clearly $\epsilon_0 = \delta_k \ge \epsilon_1 \ge \ldots \ge \epsilon_{n-k}
= a_k$. Also $\epsilon_t - \epsilon_{t+1} \le 1$ since $m = r_1^* \ge r_2^* \ge \ldots \ge r_n^* \ge m - 1$. Now
$\delta_k > a_k + u \ge a_k$; hence there is an integer $t$ ($1 \le t \le n - k$) such that
$\epsilon_t = a_k + u$. We may take $t \le r$ since, if $\epsilon_r$ is defined, its value is $(m - 1)k$
which must also be the value of $a_k$.

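In the partition diagram of $[r_i]$ we now look at columns $t + 1, t + 2, \ldots,
t + k$. They form the partition diagram of $[p_i]$ where $p_1 = p_2 = \ldots = p_{m-1}
= k$, $p_m = r - t$, and
\[
\sum_{i=1}^{m} p_i = a_k + u.
\]
The remaining columns form the partition diagram of $[q_i]$ where $q_1 = q_2 = \ldots
= q_{m-1} = n - k$, $q_m = t$. The integers $\beta_s$ and $\gamma_s$ obtained from $[p_i]$ and $[q_i]$
in the same way that the $a_s$ were obtained from $[r_i]$ are given by
\[
\beta_s = \sum_{j=k-s+1}^{k} p_j^* = \sum_{j=t+k-s+1}^{t+k} r_j^* \qquad (s = 1, 2, \ldots, k),
\]
\[
\gamma_s = \sum_{j=n-k-s+1}^{n-k} q_j^* =
\begin{cases}
a_s & \text{if } s \le n - k - t,\\
a_{k+s} - (a_k + u) & \text{if } n - k - t < s \le n - k.
\end{cases}
\]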
Consider a collection of $s \le k$ of the sets $A_1, A_2, \ldots, A_k$. Between them
they contain at least $\min(\delta_s, a_s + u)$ distinct elements. Now
\[
\delta_s = \sum_{j=1}^{s} r_j^* \ge \sum_{j=t+k-s+1}^{t+k} r_j^* = \beta_s.
\]
Also
\[
\begin{aligned}
a_s + u = (a_k + u) - (a_k - a_s)
&= \sum_{j=t+1}^{t+k} r_j^* - \sum_{j=n-k+1}^{n-s} r_j^* \\
&\ge \sum_{j=t+1}^{t+k} r_j^* - \sum_{j=t+1}^{t+k-s} r_j^* \qquad (\text{since } t \le n - k) \\
&= \beta_s.
\end{aligned}
\]
Thus any $s$ of $A_1, A_2, \ldots, A_k$ contain between them at least $\beta_s$ distinct
elements, and since $k < n$, we may apply our induction hypothesis to find
D.P.T.'s $P_1, P_2, \ldots, P_m$ of $A_1, A_2, \ldots, A_k$ of lengths $p_1, p_2, \ldots, p_m$.





ewer 


—-— Oo © f ££ OO 2 





— 


Sl een ane 





DISJOINT TRANSVERSALS OF SUBSETS 285 


Now consider a collection of s << m —k of the sets Agi, Anyo,..., An 
Together with all of Ai, Ao, ..., A, they contain at least min(d,4,, a,4, + ) 
distinct elements. Since A; \/ A2:\/...\/ Asx contains exactly a+ 4 
elements, the s sets from Ax41, Agy2,..., Aq Must contain between them at 


least min(d,,, — (az + 1%), ag4, — a) distinct elements not already used in 
P;, P2,..., Pm. If we can show that, for s = 1,2,...,” —k, 


(i) Sr+s — (a, + u) > Ys 


and 

(ii) Gey, — Oe > Yo 
we may apply our induction hypothesis to obtain D.P.T.’s Q:, Qz,..., On 
of Axis, Ango,...,An Of lengths qi, g2,...,Qm from elements not already 
used in P;, Po,..., Pm. Ifis > nm — k —t, these inequalities are obvious; for 


then vy, = ani; — (a, + u), and clearly dy4, > axis, a KC a, + u. On the 
other hand, if s <n —k —t, then y, = a,. In this case we observe that 
bers > be +a, > (a + u) + a,, from which (i) follows. Also a,4, > ar + a,, 


from which (ii) follows. This establishes the existence of $Q_1, Q_2, \ldots, Q_m$.
Finally, $P_1 \cup Q_1, P_2 \cup Q_2, \ldots, P_m \cup Q_m$ are D.P.T.'s of $A_1, A_2, \ldots, A_n$
of lengths $r_1, r_2, \ldots, r_m$, and the theorem is proved.


The application to matrices of 0's and 1's, mentioned in the introduction,
is immediate. Let $n \ge r_1 \ge r_2 \ge \ldots \ge r_m \ge 0$ and $s_1 \ge s_2 \ge \ldots \ge s_n \ge 0$.
The insertion of 1's in an $m \times n$ matrix so that there are at least $r_i$ 1's in
the $i$th row ($i = 1, 2, \ldots, m$) and not more than $s_j$ in the $j$th column ($j = 1,
2, \ldots, n$) is equivalent to the construction of D.P.T.'s of lengths $r_1, r_2, \ldots, r_m$
of $n$ disjoint sets containing respectively $s_1, s_2, \ldots, s_n$ elements. Our theorem
gives as necessary and sufficient conditions for the existence of such D.P.T.'s
the inequalities

$$\sum_{j=n-k+1}^{n} s_j \ge \alpha_k = \sum_{j=n-k+1}^{n} \bar r_j \qquad (k = 1, 2, \ldots, n).$$

(The inclusion of zeros amongst the $r$'s affects neither the hypotheses nor the
conclusion.) If we require exactly $r_i$ 1's in the $i$th row and exactly $s_j$ in the
$j$th column, we need only add the condition

$$\sum_{j=1}^{n} s_j = \sum_{i=1}^{m} r_i.$$

These are the conditions found by Ryser and Gale.
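The conditions for the exact case can be checked mechanically. The following is a minimal sketch, not the author's procedure: it assumes that $\bar r_j$ is the number of rows with $r_i \ge j$ (the length of the $j$th column of the partition diagram of $[r_i]$), and the function names are illustrative.

```python
def conjugate(r, n):
    """Column lengths of the partition diagram of r, listed for columns 1..n."""
    return [sum(1 for x in r if x >= j) for j in range(1, n + 1)]


def gale_ryser(r, s):
    """True when a 0-1 matrix with row sums r and column sums s can exist.

    Checks the total-sum condition and, for each k, that the k smallest
    column sums dominate the k shortest columns of the partition diagram.
    """
    n = len(s)
    if any(x < 0 or x > n for x in r) or sum(r) != sum(s):
        return False
    s_sorted = sorted(s, reverse=True)
    r_bar = conjugate(r, n)  # non-increasing by construction
    return all(
        sum(s_sorted[n - k:]) >= sum(r_bar[n - k:]) for k in range(1, n + 1)
    )


print(gale_ryser([2, 2, 1], [2, 2, 1]))  # a suitable 3 x 3 matrix exists
print(gale_ryser([3, 1], [2, 2, 0]))     # a full row needs every column non-empty
```

The second call fails at $k = 1$: the smallest column sum is 0, while the last column of the partition diagram of $[3, 1]$ has length 1.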


REFERENCES 


1. David Gale, A theorem on flows in networks, Pacific J. Math., 7 (1957), 1073-82. 
2. Philip Hall, On representatives of subsets, J. Lond. Math. Soc., 10 (1935), 26-30.
3. Oystein Ore, Graphs and matching theorems, Duke Math. J., 22 (1955), 625-39. 


4. H. J. Ryser, Combinatorial properties of matrices of zeros and ones, Can. J. Math., 9 (1957), 
371-7. 


Harvard University 











SEPARATION AND APPROXIMATION IN 
TOPOLOGICAL VECTOR LATTICES 


SOLOMON LEADER 


1. Introduction. Spectral theory in its lattice-theoretic setting proves 
abstractly that the indicators of measurable sets generate the space L of 
Lebesgue-integrable functions on an interval. We are concerned here with 
abstractions suggested by the fact that indicators of intervals suffice to generate 
L. Our results show that the approximation of arbitrary elements of a topo- 
logical vector lattice rests upon the ability to separate disjoint elements f and 
g by an operation that behaves in the limit like a projection annihilating f and 
leaving g invariant. 

The introduction of this concept of separation together with the notion of 
limit unit leads (via the Fundamental Lemma) to abstract generalizations 
of the Radon-Nikodym Theorem (Theorem 1) and the Stone-Weierstrass 
Theorem (Theorem 3). Even for lattices which have representations as 
function spaces our abstract approach has several advantages: (i) the domain 
plays no explicit role in the theory, (ii) we are not restricted to the topology 
of uniform convergence, and (iii) the functions under consideration need not 
be bounded, although they must be limits of bounded functions. Thus, Theorem 
3 is actually stronger than Stone's theorem (12). We do not assume
conditional $\sigma$-completeness (1) in our lattices, so countable-additivity plays no
role in the Boolean ring of Theorem 1.

The author is indebted to the referees for clarifying the general setting of 
the theory. 


2. Positive operators on a vector lattice. Let $\mathfrak{L}$ be a vector lattice
with real scalars. The following lattice-group properties will prove useful
(1, 4, 9):

(2.1) $f + g = f \vee g + f \wedge g$

(2.2) $(f - f \wedge g) \wedge (g - f \wedge g) = 0$

(2.3) $|f \wedge h - g \wedge h| \le |f - g|$

(2.4) $|f \vee h - g \vee h| \le |f - g|$.
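As an illustration only, properties (2.1) to (2.4) can be spot-checked in the simplest vector lattice, real vectors with the coordinatewise ordering. The sketch below assumes that model (integer coordinates, so the comparisons are exact); the helper names are not from the paper.

```python
import random


def meet(f, g):
    return [min(a, b) for a, b in zip(f, g)]


def join(f, g):
    return [max(a, b) for a, b in zip(f, g)]


def diff(f, g):
    return [a - b for a, b in zip(f, g)]


def absolute(f):
    return [abs(a) for a in f]


def leq(f, g):
    return all(a <= b for a, b in zip(f, g))


def check_lattice_identities(f, g, h):
    """Return the truth values of (2.1), (2.2), (2.3), (2.4) for f, g, h."""
    m = meet(f, g)
    id_21 = [a + b for a, b in zip(f, g)] == [a + b for a, b in zip(join(f, g), m)]
    id_22 = meet(diff(f, m), diff(g, m)) == [0] * len(f)
    id_23 = leq(absolute(diff(meet(f, h), meet(g, h))), absolute(diff(f, g)))
    id_24 = leq(absolute(diff(join(f, h), join(g, h))), absolute(diff(f, g)))
    return id_21, id_22, id_23, id_24


rng = random.Random(0)
sample = [[rng.randint(-5, 5) for _ in range(6)] for _ in range(3)]
print(check_lattice_identities(*sample))
```

A check of this kind verifies nothing beyond the coordinatewise model, but it is a quick way to catch a mis-transcribed sign in the identities.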


An operator on $\mathfrak{L}$ is a linear mapping of $\mathfrak{L}$ into itself. The operators on $\mathfrak{L}$ are
partially ordered by defining $P \le Q$ whenever $Pf \le Qf$ for all $f \ge 0$ in $\mathfrak{L}$.
Thus, positive operators are order-preserving:

(2.5) If $P \ge 0$ and $f \le g$, then $Pf \le Pg$.


Received June 5, 1958. The author is grateful for the support of the Research Council of 
Rutgers University. 









A contractor is an operator $P$ such that

(2.6) $0 \le P \le I$

where $I$ is the identity operator. We shall use the abbreviation $P'$ for $I - P$.
Thus, $P$ is a contractor if, and only if, both $P$ and $P'$ are positive operators.
Note that $P'$ is a contractor whenever $P$ is a contractor, and $PQ$ is a
contractor whenever $P$ and $Q$ are contractors.

Contractors interest us because they commute with the lattice operations:

(2.7) $P(f \wedge g) = Pf \wedge Pg$

(2.8) $P(f \vee g) = Pf \vee Pg$

and

(2.9) $P|f| = |Pf|$.

To prove (2.7) let $h = Pf \wedge Pg$. Since $f \wedge g \le f$ and $f \wedge g \le g$, (2.5) gives
$P(f \wedge g) \le Pf$ and $P(f \wedge g) \le Pg$. Hence $P(f \wedge g) \le h$. To reverse this
inequality we have $h \le Pf$ and $P'(f \wedge g) \le P'f$. Adding these gives
$h + P'(f \wedge g) \le f$. Similarly, $h + P'(f \wedge g) \le g$. Hence $h + P'(f \wedge g) \le f \wedge g$.
Transposing the second term on the left gives $h \le P(f \wedge g)$. Hence (2.7).
The dual statement (2.8) follows from (2.7) and the identity (2.1). To obtain
(2.9) set $g = -f$ in (2.8).

We call an idempotent contractor a projector. If $A$ and $B$ are projectors
and $f \ge 0$, then

(2.10) $ABf = Af \wedge Bf$.

To derive (2.10) let $g = Af \wedge Bf$. Now $ABf \le Bf \le f$ by (2.6). Applying
$A$ to the latter inequality gives $ABf \le Af$. Hence $ABf \le g$. To reverse this
inequality note that $0 \le g \le Af$ and $0 \le g \le Bf$. Since $A^2 = A$, $A'A = 0$,
so $A'g = 0$ by (2.5). Thus $Ag = g$ and similarly $Bg = g$. Hence $ABg = Ag = g$.
Since $Af \le f$ and $Bf \le f$, $g \le f$. So $ABg \le ABf$. That is, $g \le ABf$. Hence
(2.10).

From (2.10) it follows that projectors commute: $AB = BA$. Moreover, in
terms of the operator ordering, (2.10) gives $A \cap B = AB$ and hence $A \cup B
= A + B - AB$, which are easily seen to be projectors. Thus, the projectors
on $\mathfrak{L}$ form a Boolean algebra with $I$ as unit.
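A concrete instance of (2.10) and of the Boolean operations may help. The sketch below assumes the coordinatewise vector lattice of real 4-vectors, where a projector is multiplication by a 0-1 mask; the masks and the vector are illustrative choices, not taken from the paper.

```python
def apply_projector(mask, f):
    """Projector given by a 0-1 mask: keeps the coordinates where mask is 1."""
    return [m * x for m, x in zip(mask, f)]


def meet(f, g):
    return [min(a, b) for a, b in zip(f, g)]


A = [1, 1, 0, 0]
B = [1, 0, 1, 0]
f = [3, 1, 4, 2]  # f >= 0, as (2.10) requires

Af, Bf = apply_projector(A, f), apply_projector(B, f)
ABf = apply_projector(A, Bf)
BAf = apply_projector(B, Af)
union_f = [a + b - ab for a, b, ab in zip(Af, Bf, ABf)]  # (A + B - AB) f

print(ABf == meet(Af, Bf))  # (2.10)
print(ABf == BAf)           # projectors commute
print(union_f)              # f restricted to the coordinates kept by A or B
```

Here $A + B - AB$ acts as the mask of the union of the two coordinate sets, which is the sense in which the projectors form a Boolean algebra in this model.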

We remark that if $\mathfrak{L}$ is non-Archimedean, contractors need not commute.


3. Topological vector lattices. $\mathfrak{L}$ is a topological vector lattice if it is a
vector lattice with a topology making it a topological vector space possessing
a local base of neighbourhoods $\mathfrak{N}$ of 0 such that

(3.1) $f$ is in $\mathfrak{N}$ whenever $|f| \le |g|$ for some $g$ in $\mathfrak{N}$.

(In (10) $\mathfrak{L}$ is called a locally-solid lattice-ordered linear topological space.)
The lattice operations as well as the vector operations are continuous in $\mathfrak{L}$.
Every Banach lattice (1) is clearly a topological vector lattice.













Given an arbitrary set $\mathfrak{U}$ of elements in a topological vector space $\mathfrak{B}$, we
say $\mathfrak{U}$ generates $\mathfrak{B}$ if $\mathfrak{B}$ is the smallest closed linear subspace of $\mathfrak{B}$ which
contains $\mathfrak{U}$.

A positive element $u$ in a topological vector lattice is a limit bound of $f$ if

(3.2) $|f| \wedge nu \to |f|$ as $n \to \infty$.

$f$ is bounded relative to $u$ if $|f| \le nu$ for some $n$. Now, $u$ is a limit bound of
$f$ if, and only if, $f$ is a limit of elements bounded relative to $u$. For, given (3.2)
and $|h| \le |f|$, we have, using (2.3), $0 \le |f| \wedge nu - |h| \wedge nu \le |f| - |h|$. Hence
$0 \le |h| - |h| \wedge nu \le |f| - |f| \wedge nu$. From (3.2) and (3.1) we have $|h| \wedge nu \to |h|$.
Taking first $h = f^+$ and then $h = f^-$ gives $f^+ \wedge nu - f^- \wedge nu \to f$ as
$n \to \infty$. Conversely, given a net (8) of bounded elements converging to $f$,
$f_t \to f$, we have $|f_t| = |f_t| \wedge nu$ for $n$ sufficiently large. So using (2.3),

$$0 \le |f| - |f| \wedge nu \le \big|\,|f| - |f_t|\,\big| + \big|\,|f_t| \wedge nu - |f| \wedge nu\,\big| \le 2\,\big|\,|f| - |f_t|\,\big| \le 2|f - f_t|.$$

Hence (3.2) follows from (3.1).

We say $u$ is a limit unit in $\mathfrak{L}$ if $u$ is a limit bound for every $f$ in $\mathfrak{L}$, that is,
if the bounded elements relative to $u$ are dense in $\mathfrak{L}$. A limit unit is always a
weak unit (1) if the topology in $\mathfrak{L}$ is $T_1$, that is, if finite sets are closed. To
prove this let $f \wedge u = 0$. Then we have $\tfrac{1}{n}(f \wedge nu) \le f$ and $\tfrac{1}{n}(f \wedge nu)
\le u$. So $f \wedge nu = 0$. Hence (3.2) implies $f = 0$. We remark that a weak unit
need not be a limit unit.

A set $\mathfrak{C}$ of operators on a topological vector lattice $\mathfrak{L}$ is said to separate
$f$ from $g$ if for every neighbourhood $\mathfrak{N}$ of 0 in $\mathfrak{L}$ there exists $P$ in $\mathfrak{C}$ such
that both $f - Pf$ and $Pg$ are in $\mathfrak{N}$, that is, if there exists a net $P_t$ in $\mathfrak{C}$ such
that $P_t f \to f$ and $P_t g \to 0$. We say $\mathfrak{C}$ separates $f$ and $g$ if it separates $f$ from
$g$ and $g$ from $f$.


4. Approximation by contractors on a limit unit. Our approximation
theorems all depend upon the following lemma:

FUNDAMENTAL LEMMA. Let $u$ be a limit unit in a topological vector lattice $\mathfrak{L}$
and $\mathfrak{C}$ a set of contractors on $\mathfrak{L}$ such that $\mathfrak{C}$ separates every pair $f$ and $g$ in $\mathfrak{L}$
for which $f \wedge g = 0$. Then the set of all $PQ'u$ with $P$ and $Q$ in $\mathfrak{C}$ generates $\mathfrak{L}$.

Proof. Since $u$ is a limit unit we need only show that for $|f| \le \lambda u$ and $\mathfrak{N}$
any neighbourhood of 0 satisfying (3.1) there exists $g$ of the form $\sum \lambda_k P_k Q_k' u$
with $P_k$ and $Q_k$ in $\mathfrak{C}$ such that $f - g$ is in $\mathfrak{N}$.

Consider an arbitrary $\epsilon > 0$. We may assume $\epsilon$ is small enough to ensure
that $\epsilon u$ is interior to $\mathfrak{N}$, using the continuity of scalar multiplication. Choose
$\lambda_0, \lambda_1, \ldots, \lambda_N$ with $\lambda_k - \lambda_{k-1} = \epsilon$ for $k = 1, \ldots, N$ and $\lambda_0 u \le f \le \lambda_N u$. For
notational simplicity let $f_k = f - \lambda_k u$. By the hypothesis of separation there
exists for each $k$ a net $P_k(t)$ in $\mathfrak{C}$ such that

(4.1) $P_k f_k^+ \to 0$ and $P_k' f_k^- \to 0$,










the limits being taken with respect to $t$. (We hereafter abbreviate $P_k(t)$ to $P_k$.)

Since $f_0^- = 0$ we may assume $P_0 = 0$. Also, since $f_N^+ = 0$ we may take the
net $P_N$ such that

(4.2) $P_N u \to u$

applying the separation hypothesis to $u$ and 0. Now,

(4.3) $0 \le \epsilon P_{k-1} P_k' u = P_{k-1} P_k' (f_{k-1} - f_k) \le P_{k-1} P_k' (|f_{k-1}| + |f_k|)$
$\le P_{k-1} f_{k-1}^+ + P_k' f_{k-1}^- + P_{k-1} f_k^+ + P_k' f_k^- \le 2 P_{k-1} f_{k-1}^+ + 2 P_k' f_k^-$

since $f_{k-1}^- \le f_k^-$ and $f_k^+ \le f_{k-1}^+$. Since the right side of (4.3) converges to 0
by (4.1), we have via (3.1)

(4.4) $P_{k-1} P_k' u \to 0$.

Since $P_k P_{k-1}' = (P_k - P_{k-1}) + P_{k-1} P_k'$ and $P_0 = 0$,

(4.5) $\sum P_k P_{k-1}' = P_N + \sum P_{k-1} P_k'$

with summation over $k = 1, \ldots, N$. Applying (4.5) to $u$ and taking limits
with respect to $t$, we obtain via (4.4) and (4.2)

(4.6) $\sum P_k P_{k-1}' u \to u$.

Recalling that $|f| \le \lambda u$ and $P_{k-1} P_k'$ is a contractor, we have
$|P_{k-1} P_k' f| \le \lambda P_{k-1} P_k' u$
by (2.5) and (2.9). Hence (4.4) gives

(4.7) $P_{k-1} P_k' f \to 0$.

Similarly, since $P_N' u \to 0$ by (4.2), $P_N' f \to 0$. So (4.5) and (4.7) give

(4.8) $\sum P_k P_{k-1}' f \to f$.

Now since $f_k^- \le f_{k-1}^- + \epsilon u$,

(4.9) $|P_k P_{k-1}' f_k| \le P_k f_k^+ + P_k P_{k-1}' f_k^- \le P_k f_k^+ + P_{k-1}' f_{k-1}^- + \epsilon P_k P_{k-1}' u$.

Thus,

(4.10) $\big|f - \sum \lambda_k P_k P_{k-1}' u\big| \le \big|f - \sum P_k P_{k-1}' f\big| + \big|\sum P_k P_{k-1}' f_k\big|$
$\le \big|f - \sum P_k P_{k-1}' f\big| + \sum P_k f_k^+ + \sum P_{k-1}' f_{k-1}^- + \epsilon \sum P_k P_{k-1}' u$.

By (4.8), (4.1), and (4.6) the right side of (4.10) converges to $\epsilon u$, which is
interior to $\mathfrak{N}$. Hence, the right side of (4.10) is eventually in $\mathfrak{N}$. By (3.1), the
left side of (4.10) is likewise eventually in $\mathfrak{N}$, which proves the lemma.
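The construction in the proof can be made concrete in a simple model. The sketch below assumes real functions on a finite set (stored as lists) with $u$ the constant function 1, and takes $P_k$ to be multiplication by the indicator of $\{f \le \lambda_k\}$, so that the limits in (4.1) hold exactly; these choices and the function name are illustrative, not the paper's.

```python
def step_approximation(f, eps):
    """Return sum_k lambda_k * P_k * P'_{k-1} * u for a function f on a finite set.

    P_k is the indicator of {f <= lambda_k}; lambda_0 lies strictly below min f,
    so P_0 = 0, and the last level is >= max f, so P_N = I.
    """
    lo, hi = min(f) - eps / 2, max(f)
    levels = [lo]
    while levels[-1] < hi:
        levels.append(lo + len(levels) * eps)  # lambda_k = lambda_0 + k * eps
    P = [[1 if x <= lam else 0 for x in f] for lam in levels]
    g = [0.0] * len(f)
    for k in range(1, len(levels)):
        for i in range(len(f)):
            # P_k P'_{k-1} u is the indicator of {lambda_{k-1} < f <= lambda_k}
            g[i] += levels[k] * P[k][i] * (1 - P[k - 1][i])
    return g


f = [0.0, 0.3, 1.0, -0.7, 2.5]
g = step_approximation(f, 0.25)
print(max(abs(a - b) for a, b in zip(f, g)) <= 0.25)
```

In this model the products $P_k P_{k-1}'$ pick out disjoint level sets of $f$, and the resulting step function stays within $\epsilon u$ of $f$, which is the estimate (4.10) with all limit terms equal to zero.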





5. Approximation by projectors on a limit unit. 


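THEOREM 1. Let $\mathfrak{R}$ be a Boolean ring of projectors on a topological vector
lattice $\mathfrak{L}$ and $u$ be a limit unit in $\mathfrak{L}$. Then $\mathfrak{R}u$, the set of all $Eu$ for $E$ in $\mathfrak{R}$,
generates $\mathfrak{L}$ if, and only if, $\mathfrak{R}$ separates every pair $f$ and $g$ in $\mathfrak{L}$ for which $f \wedge g = 0$.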













Proof. Let $\mathfrak{R}u$ generate $\mathfrak{L}$. Then, given $f \wedge g = 0$, there exists a net $f_t$
converging to $f$ and a corresponding net $g_t$ converging to $g$ of the form:

(5.1) $f_t = \sum \alpha_k E_k u, \qquad g_t = \sum \beta_k E_k u$

where $E_k$ is in $\mathfrak{R}$ and $E_i E_j = 0$ for $i \ne j$. Since $f_t \to f$, $|f_t| \to |f|$ by (3.1).
Moreover $f \ge 0$, so we may assume $f_t \ge 0$, and similarly $g_t \ge 0$. That is,
$\alpha_k \ge 0$ and $\beta_k \ge 0$ in (5.1). Let $A_t$ be the sum of those $E_k$ in (5.1) for which
$\alpha_k \le \beta_k$. Since $f_t \wedge g_t = \sum \delta_k E_k u$ where $\delta_k$ is the smaller of $\alpha_k$ and $\beta_k$, we
have $0 \le A_t f_t \le f_t \wedge g_t$ and $0 \le A_t' g_t \le f_t \wedge g_t$. Therefore

(5.2) $|A_t f| \le |A_t f - A_t f_t| + A_t f_t \le |f - f_t| + f_t \wedge g_t$.

Since $f \wedge g = 0$,

$$f_t \wedge g_t \le |f \wedge g - f \wedge g_t| + |f \wedge g_t - f_t \wedge g_t| \le |g - g_t| + |f - f_t|$$

by (2.3). Hence (5.2) gives $|A_t f| \le |g - g_t| + 2|f - f_t|$. Since $f_t \to f$ and
$g_t \to g$, $A_t f \to 0$ by (3.1). Similarly

$$|A_t' g| \le |f - f_t| + 2|g - g_t|.$$

Hence, $A_t' g \to 0$.

The converse follows directly from the fundamental lemma, since $PQ'$ is
in $\mathfrak{R}$ for $P$ and $Q$ in $\mathfrak{R}$.


6. Topological lattice algebras. Let $\mathfrak{A}$ be a $T_1$ topological vector lattice
in which an associative, distributive multiplication is defined making $\mathfrak{A}$ a
topological algebra with a multiplicative unit 1 which is also a limit unit.
Moreover, let $fg \ge 0$ whenever both $f \ge 0$ and $g \ge 0$. We call $\mathfrak{A}$ a topological
lattice algebra. From (2) it follows that multiplication is commutative in $\mathfrak{A}$.

We shall apply the results of the preceding sections by viewing the elements
of $\mathfrak{A}$ as operators on $\mathfrak{A}$ via multiplication. This is effective because the operator
ordering for elements of $\mathfrak{A}$ is just the ordering in $\mathfrak{A}$. A few simple lemmas
serve to establish the basic properties of $\mathfrak{A}$.


LEMMA 1. If $f \wedge g = 0$, then $fg = 0$.

Proof. Let $f_n = f \wedge n1$ and $g_n = g \wedge n1$. Since 1 is a limit unit $f_n \to f$ and
$g_n \to g$. Since multiplication is continuous $f_n g_n \to fg$. Thus, it suffices to show
$f_n g_n = 0$. Since $0 \le f_n \le f$ and $0 \le g_n \le g$ we have $0 \le f_n \wedge g_n \le f \wedge g$. So
$f_n \wedge g_n = 0$, since $f \wedge g = 0$. Moreover, $0 \le f_n \le n1$ and since $g_n \ge 0$,
$0 \le f_n g_n \le n g_n$. Similarly $f_n g_n \le n f_n$. Hence

$$0 \le f_n g_n \le n f_n \wedge n g_n = n (f_n \wedge g_n),$$

and so $f_n g_n = 0$.













LEMMA 2. $f^2 = |f|^2$. Hence, $f^2 \ge 0$.

Proof. By Lemma 1, $f^+ f^- = 0$. So $f^2 = (f^+ - f^-)^2 = (f^+)^2 + (f^-)^2 = |f|^2$.

LEMMA 3. If $f^2 = 0$, then $f = 0$.

Proof. By Lemma 2 we may assume without loss of generality that $f \ge 0$.
Consider any $\epsilon > 0$. Now $(f - \epsilon 1)^2 = -2\epsilon f + \epsilon^2 1$, which is positive by
Lemma 2. So $2\epsilon f \le \epsilon^2 1$. Dividing by $\epsilon$ we get $0 \le 2f \le \epsilon 1$. Letting $\epsilon \to 0$
gives $f = 0$.

LEMMA 4. If $f \ge 0$, $g \ge 0$, and $fg = 0$, then $f \wedge g = 0$.

Proof. Let $h = f \wedge g$. Then $0 \le h \le f$ and $0 \le h \le g$. Therefore $0 \le h^2
\le fh \le fg = 0$. So $h^2 = 0$. By Lemma 3, $h = 0$.

LEMMA 5. $|fg| = |f|\,|g|$.

Proof. $fg = (f^+ - f^-)(g^+ - g^-) = (f^+ g^+ + f^- g^-) - (f^+ g^- + f^- g^+)$, a
difference of two positive terms. That the product of these two terms is 0 follows
from Lemma 1, using the commutative, distributive, and associative laws.
Hence, by Lemma 4, the two terms are disjoint. Thus,

$$(fg)^+ = f^+ g^+ + f^- g^-$$

and

$$(fg)^- = f^+ g^- + f^- g^+.$$

Therefore,

$$|fg| = (fg)^+ + (fg)^- = (f^+ + f^-)(g^+ + g^-) = |f|\,|g|.$$

LEMMA 6. $fg = 0$ if, and only if, $|f| \wedge |g| = 0$.

Proof. By Lemma 5, $fg = 0$ if, and only if, $|f|\,|g| = 0$. By Lemmas 1 and
4, $|f|\,|g| = 0$ if, and only if, $|f| \wedge |g| = 0$.
7. Projectors on a topological lattice algebra. 
LEMMA 7. The identity

(7.1) $(Ef)g = f(Eg) = (Ef)(Eg)$

holds for every projector $E$ on $\mathfrak{A}$.

Proof. $(Ef)g - f(Eg) = (Ef)(E'g) - (Eg)(E'f)$, an identity which can be
verified by setting $E' = I - E$ on the right and expanding. We shall show
that each of the terms on the right side of this identity is 0, in order to derive
the first equation in (7.1). Now by (2.9), (2.5), and (2.10),

$$|Ef| \wedge |E'g| = E|f| \wedge E'|g| \le E(|f| + |g|) \wedge E'(|f| + |g|) = EE'(|f| + |g|) = 0.$$

Thus, by Lemma 6, $(Ef)(E'g) = 0$. Similarly $(Eg)(E'f) = 0$. The second
equation in (7.1) follows if we replace $f$ in the first equation by $Ef$.










LEMMA 8. The projectors $E$ on $\mathfrak{A}$ are isomorphic to the idempotent elements
$e$ of $\mathfrak{A}$ via the correspondence $E \sim e$ induced by

(7.2) $E1 = e$

and

(7.3) $ef = Ef$.

Proof. Given any idempotent $e = e^2$ in $\mathfrak{A}$, Lemma 2 implies $e \ge 0$. Since
$1 - e$ is also idempotent we have $0 \le e \le 1$. Thus $E$ defined by (7.3) is a
projector. Conversely, every projector $E$ defines an idempotent $e$ via (7.2)
which, by Lemma 7, satisfies (7.3). Clearly, $I \sim 1$ and for $A \sim a$ and $B \sim b$,
$AB \sim ab$.

The next theorem follows directly from Theorem 1 via Lemmas 6 and 8.

THEOREM 2. Let $\mathfrak{R}$ be a Boolean ring of idempotents in a topological lattice
algebra $\mathfrak{A}$. Then $\mathfrak{R}$ generates $\mathfrak{A}$ if, and only if, $\mathfrak{R}$ separates every pair $f$ and $g$
in $\mathfrak{A}$ for which $fg = 0$.


8. Subalgebras dense in $\mathfrak{A}$. A subalgebra of $\mathfrak{A}$ is a linear subspace which
is closed under multiplication.

THEOREM 3. Let $\mathfrak{R}$ be a subalgebra of a topological lattice algebra $\mathfrak{A}$. Then $\mathfrak{R}$
is dense in $\mathfrak{A}$ if, and only if, $\mathfrak{R}$ separates every pair $f$ and $g$ in $\mathfrak{A}$ for which
$fg = 0$.

To prove this theorem we need another lemma.

LEMMA 9. The following conditions are equivalent:
(i) $\mathfrak{R}$ separates $f$ and $g$ whenever $fg = 0$.
(ii) The set of all contractors in the closure of $\mathfrak{R}$ separates $f$ and $g$ whenever
$f \wedge g = 0$.

Proof. We first show that (i) implies that the closure of $\mathfrak{R}$ is a lattice and
contains the unit 1. Now the trivial identity $f - g = (f - f \wedge g) - (g - f \wedge g)$
gives, in view of (2.2),

(8.1) $(f - g)^+ = f - f \wedge g$.

Thus, to show that the closure of $\mathfrak{R}$ is a lattice we need only show that it
contains $f^+$ whenever it contains $f$. Since $f^+ f^- = 0$, (i) implies the existence
of a net $h_t$ in $\mathfrak{R}$ such that $h_t f^+ \to f^+$ and $h_t f^- \to 0$. Hence $h_t f \to f^+$. Since $h_t f$ is
in the closure of $\mathfrak{R}$, so is $f^+$. That 1 is in the closure of $\mathfrak{R}$ follows from (i),
since $\mathfrak{R}$ must separate 1 from 0.

Given $f \wedge g = 0$, (i) gives a net $h_t$ in $\mathfrak{R}$ with $h_t f \to 0$ and $h_t g \to g$. Let
$p_t = |h_t| \wedge 1$ which is in the closure of $\mathfrak{R}$ by the preceding arguments. Clearly,
$p_t$ is a net of contractors: $0 \le p_t \le 1$. Moreover, since $0 \le p_t \le |h_t|$, $0 \le p_t f
\le |h_t f|$ using Lemma 5. So by (3.1), $p_t f \to 0$. From the identity (8.1) we







have $1 - p_t = (1 - |h_t|)^+$. So $(1 - p_t)g \le |(1 - |h_t|)g| \le |g - h_t g|$. Hence,
$p_t g \to g$. Thus (i) implies (ii).

Given (ii) and $fg = 0$, $|f| \wedge |g| = 0$ by Lemma 6. So there exists a net of
contractors $p_t$ in the closure of $\mathfrak{R}$ separating $|g|$ from $|f|$: $p_t|f| \to 0$ and $p_t|g| \to |g|$
with $0 \le p_t \le 1$. Using Lemma 5 we have $p_t f \to 0$ and $(1 - p_t)g \to 0$. Since
$p_t$ is in the closure of $\mathfrak{R}$ there exists $h_t$ in $\mathfrak{R}$ such that $p_t - h_t \to 0$. Hence
$|h_t f| \le |h_t - p_t|\,|f| + p_t|f|$ and $|(1 - h_t)g| \le (1 - p_t)|g| + |p_t - h_t|\,|g|$. So
$h_t f \to 0$ and $h_t g \to g$, giving (i).

Proof of Theorem 3. Given (i) we have (ii) by Lemma 9. By the Fundamental
Lemma, (ii) implies $\mathfrak{R}$ is dense in $\mathfrak{A}$. Conversely, we shall show that if the
closure of $\mathfrak{R}$ is $\mathfrak{A}$, then (ii), and hence (i) holds.

Given $f \wedge g = 0$ let

$$p_n = n\left(g \wedge \tfrac{1}{n} 1\right).$$

We contend that $p_n$ is a sequence of contractors separating $g$ from $f$. Clearly,
$0 \le p_n \le 1$. Since $0 \le p_n \le ng$, $0 \le p_n f \le nfg$. Now $fg = 0$ by Lemma 6,
so $p_n f = 0$.

Noting that

$$1 - p_n = n\left(\tfrac{1}{n} 1 - g \wedge \tfrac{1}{n} 1\right),$$

apply (2.2) to $\tfrac{1}{n} 1$ and $g$ to obtain, via Lemma 6,

$$(1 - p_n)\left(g - \tfrac{1}{n} p_n\right) = 0.$$

Hence,

$$(1 - p_n)g = \tfrac{1}{n}\, p_n (1 - p_n).$$

So $(1 - p_n)g \to 0$.
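The separating sequence $p_n$ can be illustrated in a simple model. The sketch below assumes functions on a four-point set with pointwise operations, where $n(g \wedge \tfrac{1}{n}1)$ equals $(ng) \wedge 1$ pointwise; the sample vectors are arbitrary disjoint non-negative functions chosen for illustration.

```python
def separator(g, n):
    """p_n = n (g ^ (1/n) 1), computed pointwise as min(n * g, 1)."""
    return [min(n * x, 1.0) for x in g]


f = [2.0, 0.5, 0.0, 0.0]
g = [0.0, 0.0, 0.01, 3.0]  # f ^ g = 0: the supports are disjoint

n = 10
p = separator(g, n)
p_f = [a * b for a, b in zip(p, f)]
rest = [(1 - a) * b for a, b in zip(p, g)]  # (1 - p_n) g

print(p_f)                    # p_n f vanishes identically
print(max(rest) <= 1.0 / n)   # (1 - p_n) g is bounded by (1/n) 1
```

Here $p_n$ equals 1 wherever $g \ge 1/n$ and vanishes wherever $g = 0$, so it annihilates $f$ while leaving all but a part of $g$ of size at most $1/n$ unchanged.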


9. Absolutely continuous set functions. Let $u$ be a bounded, non-negative,
finitely additive measure on a Boolean algebra $\mathfrak{B}$ with unit $I$. The
Banach lattice $\mathfrak{L}$ dealt with in (3) and (6) consists of all finitely additive,
real valued functions $f$ on $\mathfrak{B}$ which are absolutely continuous with respect
to $u$:

(9.1) $f(E) \to 0$ as $u(E) \to 0$.

The norm in $\mathfrak{L}$ is defined by

(9.2) $\|f\| = \sup\, [f(E) - f(E')]$










where the supremum is taken over all $E$ in $\mathfrak{B}$. The partial ordering is induced
by defining $f \ge 0$ whenever $f(E) \ge 0$ for all $E$ in $\mathfrak{B}$. With this ordering

(9.3) $(f \wedge g)(A) = \inf\, [f(EA) + g(E'A)]$

and

(9.4) $(f \vee g)(A) = \sup\, [f(EA) + g(E'A)]$

taken over all $E$ in $\mathfrak{B}$ (1, 3, 4, 6). Since $|f| = f \vee -f$, (9.2) and (9.4) give

(9.5) $\|f\| = |f|(I)$.

Every $E$ in $\mathfrak{B}$ defines a projector $E$ given by

(9.6) $Ef(A) = f(EA)$

for all $A$ in $\mathfrak{B}$. Thus $\mathfrak{B}$, modulo the ideal of all $E$ with $u(E) = 0$, is isomorphic
to a subalgebra of the Boolean algebra of all projectors on $\mathfrak{L}$.
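Formulas (9.2) and (9.3) can be evaluated by brute force when the Boolean algebra is finite. The sketch below assumes the algebra of all subsets of a three-point set and set functions given by point weights (an illustrative special case, not the general setting of the paper); in that case the infimum in (9.3) is the sum of the pointwise minima of the weights and the norm is the sum of the absolute weights.

```python
from itertools import combinations


def subsets(points):
    """Every element of the Boolean algebra of all subsets of `points`."""
    points = list(points)
    for size in range(len(points) + 1):
        for chosen in combinations(points, size):
            yield frozenset(chosen)


def set_function(weights):
    """Finitely additive set function: E -> sum of the point weights in E."""
    return lambda E: sum(weights[x] for x in E)


def meet_at(f, g, A, points):
    """(9.3): infimum over E of f(EA) + g(E'A)."""
    A = frozenset(A)
    return min(f(E & A) + g(A - E) for E in subsets(points))


def norm(f, points):
    """(9.2): supremum over E of f(E) - f(E')."""
    whole = frozenset(points)
    return max(f(E) - f(whole - E) for E in subsets(points))


points = [0, 1, 2]
wf = {0: 3, 1: -1, 2: 2}
wg = {0: 1, 1: 4, 2: 2}
f, g = set_function(wf), set_function(wg)

print(meet_at(f, g, points, points))  # 1 + (-1) + 2
print(norm(f, points))                # 3 + 1 + 2
```

The enumeration is exponential in the number of points, so it is only meant as a check of the formulas on very small examples.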

Now (9.1) implies that $u$ is a limit unit. To prove this let $f \ge 0$ and $f_n = f \wedge nu$.
The sequence $(f - f_n)(I)$ is decreasing, hence converges to some limit $\lambda$. In
view of (9.5) we need only show $\lambda = 0$. By (9.3), $f_n(I) = \inf\, [f(E') + nu(E)]$.
Hence we may choose a sequence $E_n$ such that

$$f_n(I) \le f(E_n') + n\,u(E_n) \le f_n(I) + \tfrac{1}{n}.$$

Multiplying by $-1$ and adding $f(I)$ we obtain

$$(f - f_n)(I) - \tfrac{1}{n} \le f(E_n) - n\,u(E_n) \le (f - f_n)(I).$$

Hence $f(E_n) - n\,u(E_n)$ converges to $\lambda$. Now $0 \le f(E_n) \le f(I)$ and $0 \le \lambda \le f(I)$
while $n$ increases without bound. Hence $u(E_n)$ must converge to 0. By (9.1),
$f(E_n)$ does likewise. So $\lambda = -\lim n\,u(E_n)$. Thus $\lambda \le 0$. But $\lambda \ge 0$. So $\lambda = 0$.

Given $f \wedge g = 0$ there exists, via (9.3) with $A = I$, a sequence $E_n$ in $\mathfrak{B}$
such that

(9.7) $f(E_n) + g(E_n') \to 0$.

By (9.6) and (9.5), $\|E_n f\| = f(E_n)$ and $\|E_n' g\| = g(E_n')$. So (9.7) implies
that $\mathfrak{B}$ separates $f$ and $g$. By Theorem 1, $\mathfrak{B}u$ generates $\mathfrak{L}$. That is, the "step
functions" are dense in $\mathfrak{L}$. (See (3) and (6).) As was pointed out by Bochner
(3), this gives the Radon-Nikodym theorem (11).


10. The finitely additive integral. Let $\mathfrak{B}$ be a Boolean algebra of
subsets $E$ of a set $I$ with $I$ as unit. Let $u$ be a bounded, non-negative, finitely
additive measure on $\mathfrak{B}$. A partition $\Delta$ is a finite class of disjoint sets in $\mathfrak{B}$
whose union is $I$. The partitions are ordered by defining $\Delta' \ge \Delta$ whenever
$\Delta'$ is a refinement of $\Delta$. For $f(x)$ real-valued on the domain $I$ and $\Delta = \{E_1,
\ldots, E_n\}$ any partition, let

(10.1) $s(\Delta) = \sum f(x_k)\,u(E_k)$













where $x_k$ is any point in $E_k$ and $k$ ranges through $1, \ldots, n$. In general, $s(\Delta)$
is a many-valued function of $\Delta$, a particular value depending on the choice
of $x_k$ in $E_k$. If $\lim s(\Delta)$ exists (in the Moore-Smith sense (8)) uniformly for
all such choices, then $f$ is said to be integrable.

Introducing the upper and lower Darboux sums

(10.2) $\bar s(\Delta) = \sum \sup f(x_k)\,u(E_k)$

and

$$\underline{s}(\Delta) = \sum \inf f(x_k)\,u(E_k),$$

let $S(\Delta, f) = \bar s(\Delta) - \underline{s}(\Delta)$. In (10.2) we assume $\infty \cdot 0 = 0$. Since $\limsup s(\Delta)
= \lim \bar s(\Delta)$ and $\liminf s(\Delta) = \lim \underline{s}(\Delta)$, $f$ is integrable if, and only if,
$\lim S(\Delta, f) = 0$. Note that for any $f$, $S(\Delta, f)$ is a decreasing function of $\Delta$.
Since $S(\Delta, \alpha f + \beta g) \le |\alpha|\,S(\Delta, f) + |\beta|\,S(\Delta, g)$ the integrable functions form
a vector space. Since $S(\Delta, 1) = 0$ the constant functions are integrable. That
products of integrable functions are integrable follows from the inequality
$S(\Delta, fg) \le M(f)\,S(\Delta, g) + M(g)\,S(\Delta, f)$ where $M(f)$ is the supremum of
$|f(x)|$ for $x$ restricted to those sets in $\Delta$ which are not of measure zero. That
$|f|$ is integrable whenever $f$ is integrable follows from the inequality $S(\Delta, |f|)
\le S(\Delta, f)$. Given $|f(x) - g(x)| < \epsilon$ for all $x$ we have $S(\Delta, f) \le S(\Delta, g)
+ S(\Delta, f - g) \le S(\Delta, g) + 2\epsilon u(I)$. So a uniform limit of integrable
functions is integrable. Since an integrable function is bounded except on a set
of measure zero, we shall consider only bounded integrable functions. These
form a topological lattice algebra under uniform convergence with the usual
ordering and algebraic operations. Using Theorem 2, we shall show that this
algebra is generated by its idempotents. Thus, it suffices to show that for
$f$ any bounded integrable function, $f^-$ can be separated from $f^+$ by integrable
idempotents.
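The quantity $S(\Delta, f)$ is easy to compute in a familiar special case. The sketch below assumes $I = [0, 1)$, partitions into half-open intervals with $u$ the interval length, and a continuous nondecreasing $f$, so that the supremum and infimum over $[a, b)$ are $f(b)$ and $f(a)$; these assumptions and the function names are illustrative only.

```python
from fractions import Fraction


def darboux_gap(f, cuts):
    """S(Delta, f) = upper sum - lower sum for a continuous nondecreasing f.

    `cuts` lists the endpoints a_0 < a_1 < ... < a_n of the intervals
    [a_{k-1}, a_k) that make up the partition.
    """
    upper = sum(f(b) * (b - a) for a, b in zip(cuts, cuts[1:]))
    lower = sum(f(a) * (b - a) for a, b in zip(cuts, cuts[1:]))
    return upper - lower


def uniform_cuts(n):
    return [Fraction(k, n) for k in range(n + 1)]


def square(x):
    return x * x


gaps = [darboux_gap(square, uniform_cuts(n)) for n in (2, 4, 8)]
print(gaps)  # each refinement halves the gap
```

For a uniform partition into $n$ intervals the gap telescopes to $(f(1) - f(0))/n$, so $S(\Delta, f)$ decreases to 0 along refinements, in line with the criterion $\lim S(\Delta, f) = 0$.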

Consider any $\epsilon > 0$. Choose a sequence $\Delta_n$ of partitions such that $\Delta_{n+1} \ge \Delta_n$
and $S(\Delta_n, f) \to 0$, which is possible because $f$ is integrable. Let $C_n$ be the union
of those sets $E$ belonging to the partition $\Delta_n$ for which there exist $x$ and $y$
in $E$ with $f^+(x) \ge \epsilon$ and $f^-(y) \ge \epsilon$. By induction, starting with $A_0 = B_0 = \emptyset$
and $C_0 = I$, let $A_n$ be the union of $A_{n-1}$ and those sets $E$ in $\Delta_n$ which are
contained in $C_{n-1}$ and have $f^+(x) < \epsilon$ for all $x$ in $E$. Let $B_n$ be the union of
$B_{n-1}$ and those sets $E$ in $\Delta_n$ which are contained in $C_{n-1}$, have $f^-(x) < \epsilon$ for
all $x$ in $E$, and have $f^+(y) \ge \epsilon$ for some $y$ in $E$. Then $A_{n-1}$ is a subset of $A_n$,
$B_{n-1}$ of $B_n$, and $C_n$ of $C_{n-1}$. Since $2\epsilon u(C_n) \le S(\Delta_n, f)$, we have $u(C_n) \to 0$. Let
$A = \lim A_n$ and $C = \lim C_n$. Let $E$ be the union of $A$ with the set of all
points $x$ in $C$ for which $f^+(x) = 0$. Let $e$ be the indicator of $E$:

(10.3) $e(x) = 1$ for $x$ in $E$, and $e(x) = 0$ for $x$ in $E'$.

Since $A_n$ is contained in $E$ and $B_n$ is contained in $E'$, $e(x)$ equals 1 for $x$ in
$A_n$ and 0 for $x$ in $B_n$. Hence, $S(\Delta_n, e) \le u(C_n)$ which converges to 0. So $e$













is integrable. For $x$ in $E$ either $x$ is in $C$ with $f^+(x) = 0$ or $x$ belongs to some
$A_n$, implying $f^+(x) < \epsilon$. Clearly then $ef^+ \le \epsilon 1$. For $x$ in $E'$, either $x$ is in $C$
with $f^+(x) > 0$, hence $f^-(x) = 0$, or $x$ is in some $B_n$, implying $f^-(x) < \epsilon$. So
$(1 - e)f^- \le \epsilon 1$.

Thus, by Theorem 2, the algebra of bounded integrable functions is gener- 
ated under uniform convergence by its idempotents. 

A similar result can be obtained for the almost everywhere continuous 
functions on a closed interval, using Theorem 2. Combining these two results, 
we get Lebesgue’s characterization of the Riemann integrable functions (7). 


REFERENCES 


1. G. Birkhoff, Lattice theory, A.M.S. Coll. Pub. (New York, 1940). 

2. G. Birkhoff and R. S. Pierce, Lattice-ordered rings, Anais da Acad. Brasileira de Ciencias, 
28 (1956), 41-69. 

3. S. Bochner, Additive set functions on groups, Ann. Math., 40 (1939), 769-99. 

4. S. Bochner and R. S. Phillips, Additive set functions and vector lattices, Ann. Math., 42
(1941), 316-24.


5. H. Freudenthal, Teilweise geordnete Moduln, Proc. Acad. Wet. Amsterdam, 39 (1936), 
641-51. 

6. S. Leader, The theory of $L^p$-spaces for finitely additive set functions, Ann. Math., 58 (1953),
528-43.


7. H. Lebesgue, Leçons sur l'intégration et la recherche des fonctions primitives, Gauthier-
Villars (Paris, 1928).

8. E. H. Moore and H. L. Smith, A general theory of limits, Amer. J. Math., 44 (1922), 102-21. 

9. H. Nakano, Modern spectral theory (Tokyo, 1950). 

10. I. Namioka, Partially ordered linear topological spaces, Amer. Math. Soc., Mem. 24 (1957). 

11. S. Saks, Theory of the integral (Warsaw, 1937). 

12. M. H. Stone, Applications of the theory of boolean rings to general topology, Trans. Amer. 
Math. Soc. 41 (1937), 375-481. 


Rutgers University














TENSOR PRODUCTS OF BANACH ALGEBRAS 


BERNARD R. GELBAUM¹


1. Introduction. This paper is concerned with a generalization of some
recent theorems of Hausner (1) and Johnson (4; 5). Their result can be
summarized as follows: Let $G$ be a locally compact abelian group, $A$ a
commutative Banach algebra, $B^1 = B^1(G, A)$ the (commutative Banach) algebra of
$A$-valued, Bochner integrable functions on $G$, $\mathfrak{M}_1$ the maximal ideal space of $A$,
$\mathfrak{M}_2$ the maximal ideal space of $L^1(G)$ (the [commutative Banach] algebra of
complex-valued, Haar integrable functions on $G$), $\mathfrak{M}_3$ the maximal ideal space
of $B^1$. Then $\mathfrak{M}_3$ and the Cartesian product $\mathfrak{M}_1 \times \mathfrak{M}_2$ are homeomorphic when
the spaces $\mathfrak{M}_i$, $i = 1, 2, 3$, are given their weak* topologies. Furthermore, the
association between $\mathfrak{M}_3$ and $\mathfrak{M}_1 \times \mathfrak{M}_2$ is such as to permit a description of any
epimorphism $E_3: B^1 \to B^1/M_3$ in terms of related epimorphisms $E_1: A \to A/M_1$
and $E_2: L^1(G) \to L^1(G)/M_2$, where $M_i$ is in $\mathfrak{M}_i$, $i = 1, 2, 3$.

On the other hand, Hausner (2) (and the author, independently) showed
that a similar result is valid for generalized continuous function algebras. One
form of the theorem is the following: Let $X$ be a compact Hausdorff space, $A$ a
commutative Banach algebra, $D = C(X, A)$ the (commutative Banach) algebra
of $A$-valued continuous functions on $X$, $\mathfrak{M}_1$ the maximal ideal space of $A$, $\mathfrak{M}_2$
the maximal ideal space of $C(X)$ (the [commutative Banach] algebra of
complex-valued continuous functions on $X$), $\mathfrak{M}_3$ the maximal ideal space of $D$. Then
$\mathfrak{M}_3$ and the Cartesian product $\mathfrak{M}_1 \times \mathfrak{M}_2$ are homeomorphic when the spaces
$\mathfrak{M}_i$, $i = 1, 2, 3$, are given their weak* topologies. Furthermore, the association
between $\mathfrak{M}_3$ and $\mathfrak{M}_1 \times \mathfrak{M}_2$ is such as to permit a description of any epimorphism
$E_3: D \to D/M_3$ in terms of related epimorphisms $E_1: A \to A/M_1$ and $E_2: C(X)
\to C(X)/M_2$, where $M_i$ is in $\mathfrak{M}_i$, $i = 1, 2, 3$.

The crucial point in the latter theorem is the proof that $D$ is spanned by
"simple" functions, that is, functions which are linear combinations, with
coefficients in $A$, of complex-valued continuous functions on $X$. On the other
hand, the very definition of $B^1$ shows that it is spanned by "simple" functions,
that is, this time, functions which are linear combinations, with coefficients
in $A$, of complex-valued, Haar integrable functions on $G$. Clearly, in each
instance, the collection of "simple" functions is an algebra which is a tensor

Received May 6, 1958. This research was supported by the United States Air Force through 
the Air Force Office of Scientific Research of the Air Research and Development Command, 
under contract No. AF 49 (638)-64. Reproduction in whole or in part is permitted for any 
purpose of the United States Government. 


¹The author is indebted to Professor G. K. Kalisch for many stimulating conversations on
the subject matter of this investigation.




product of A and some complex function algebra, and the object of discussion 
is the completion of this tensor product with respect to an appropriate norm. 


2. Tensor products. Let $A_1$ and $A_2$ be Banach algebras and let $A_3' =
A_1 \otimes A_2$ be their algebraic tensor product (0). As is well known (6), there
are generally many norms which can be given to $A_3'$ in terms of the norms of
$A_1$ and $A_2$. Our first result is about one of these norms.


THEOREM 1. Let $\|\ldots\|_i$ be the norms in $A_i$, $i = 1, 2$. Then the "greatest
cross norm" (6) defined in $A_3'$ by

$$\Big\| \sum_{i=1}^{n} a_1^{(i)} \otimes a_2^{(i)} \Big\|_3' = \inf \sum_{i=1}^{n} \|a_1^{(i)}\|_1\, \|a_2^{(i)}\|_2,$$

where the inf is taken over the equivalence class which defines

$$\sum_{i=1}^{n} a_1^{(i)} \otimes a_2^{(i)},$$

is a Banach algebra norm which satisfies the relationship

(‡′) $\|a_1 \otimes a_2\|_3' = \|a_1\|_1\, \|a_2\|_2$.
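Proof. In (6) the validity of the last equality is shown. We prove here the
fact that if $p$, $q$ are in $A_3'$, then $\|pq\|_3' \le \|p\|_3'\, \|q\|_3'$. To this end, let $r > 0$
be given. Then there is a choice of $a_1^{(i)}$, $a_2^{(i)}$, $b_1^{(j)}$, $b_2^{(j)}$ for which

$$\sum_{i=1}^{n} a_1^{(i)} \otimes a_2^{(i)} \quad \text{and} \quad \sum_{j=1}^{m} b_1^{(j)} \otimes b_2^{(j)}$$

define the respective equivalence classes of $p$ and $q$ and for which

$$\|p\|_3'\, \|q\|_3' \ge \Big( \sum_{i} \|a_1^{(i)}\|_1\, \|a_2^{(i)}\|_2 \Big) \Big( \sum_{j} \|b_1^{(j)}\|_1\, \|b_2^{(j)}\|_2 \Big) - r.$$

The last expression is

$$\sum_{i,j} \|a_1^{(i)}\|_1\, \|b_1^{(j)}\|_1\, \|a_2^{(i)}\|_2\, \|b_2^{(j)}\|_2 - r$$

which majorizes

$$\sum_{i,j} \|a_1^{(i)} b_1^{(j)}\|_1\, \|a_2^{(i)} b_2^{(j)}\|_2 - r.$$

Obviously, the last expression majorizes $\|pq\|_3' - r$. Since $r$ is an arbitrary
positive number, we see $\|p\|_3'\, \|q\|_3' \ge \|pq\|_3'$. This completes the proof.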


THEOREM 2. Let $A_3$ be the completion of $A_3'$ endowed with the "greatest cross
norm" $\|\ldots\|_3'$. Let $A_i$ be commutative and let $\mathfrak{M}_i$ be their respective maximal
ideal spaces with their respective weak* topologies, $i = 1, 2$. Then $A_3$ is a
commutative Banach algebra. Its norm $\|\ldots\|_3$ satisfies the analogue

(‡) $\|a_1 \otimes a_2\|_3 = \|a_1\|_1\, \|a_2\|_2$













































of (‡′). If $\|\ldots\|_3''$ is any tensor product norm relative to which $A_3'$ is a normed
algebra, with no dense reg. max. ideal (for example, if $\|\ldots\|_3''$ is the greatest cross
norm), and if $A_3$ is the completion of $A_3'$ relative to $\|\ldots\|_3''$, then $A_3$ is a
commutative Banach algebra and its maximal ideal space $\mathfrak{M}_3$ in its weak* topology is
homeomorphic with the Cartesian product $\mathfrak{M}_1 \times \mathfrak{M}_2$. Let $t$ be the homeomorphism
the existence of which is asserted: $t: \mathfrak{M}_3 \to \mathfrak{M}_1 \times \mathfrak{M}_2$, and let $t(M_3) = (M_1, M_2)$.
Then the epimorphisms

$$E_1: A_1 \to A_1/M_1, \qquad E_2: A_2 \to A_2/M_2, \qquad E_3: A_3 \to A_3/M_3,$$

are uniquely determined by the respective maximal ideals $M_i$, $i = 1, 2, 3$; $E_1$ and
$E_2$ together determine $E_3$ and conversely.


Proof. The commutativity of $A_3$ and the validity of (‡) are clear consequences
of the hypotheses.

Since $A_i/M_i$, $i = 1, 2, 3$, is the complex numbers, and since each
epimorphism $E_i$, $i = 1, 2, 3$ commutes with multiplication by complex numbers
($cE_i(a) = E_i(ca)$, $c$ complex, $a$ in $A_i$) and since the complex numbers admit
no non-trivial automorphism which commutes with multiplication by complex
numbers, the uniqueness of the $E_i$ follows.

We now proceed to set up a 1-1 correspondence between $\mathfrak{M}_3$ and $\mathfrak{M}_1 \times \mathfrak{M}_2$.
After this has been accomplished, the correspondence will be shown to be a
homeomorphism. With a view to greater ultimate generality, we shall, however,
show how to establish the kind of correspondence we need between
$\mathfrak{M}_1 \times \mathfrak{M}_2$ and a part of $\mathfrak{M}_3$ under conditions far less restrictive than those
imposed in the hypothesis of Theorem 2. This correspondence will serve when
the hypothesis of Theorem 2 is in force and will in fact prove to be the
homeomorphism which is sought. What follows then is an interlude, justified and
required by economy.

During this interlude we shall not assume that $A_1$ and $A_2$ are commutative.
$\mathfrak{M}_1$ and $\mathfrak{M}_2$ will denote their respective spaces of (two-sided) regular
maximal ideals. For each pair $(M_1, M_2)$ in $\mathfrak{M}_1 \times \mathfrak{M}_2$, let $E_1$ and $E_2$ be some
epimorphisms $E_i: A_i \to A_i/M_i$, $i = 1, 2$. For $p$ in $A_3'$, define $E_3'$ by the
formula

$$E_3'(p) = \sum_{i=1}^{n} E_1(a_1^{(i)}) \otimes E_2(a_2^{(i)}),$$

a member of the tensor product $(A_1/M_1) \otimes (A_2/M_2)$, where

$$\sum_{i=1}^{n} a_1^{(i)} \otimes a_2^{(i)}$$

is some representation of $p$. Clearly $E_3'(p)$ does not depend on the
representation of $p$ and is an epimorphism of $A_3'$. $E_3': A_3' \to (A_1/M_1) \otimes (A_2/M_2)$.
Let $A_3'$ have the norm $\|\ldots\|_3''$ and let $E_3'(A_3')$ have the quotient space
norm (which is independent of the choice of $E_1$ and $E_2$). The quotient space
norm is admissible as a true norm since $A_3'$ has no dense reg. max. ideal. Then,








300 BERNARD R. GELBAUM 


relative to these topologies, E;’ is a bounded (hence uniformly continuous) 
transformation of A;’ and has a unique extension E;, an epimorphism of A; 
onto the completion of (A:/M,) @ (A2/M:2) relative to its (quotient space) 
norm. 

Let $M_3 = E_3^{-1}(0)$. $M_3$ is an ideal in $A_3$. If $u_i$ are identities modulo $M_i$ in
$A_i$ ($i = 1, 2$), then $E_3(u_1 \otimes u_2)$ is an identity in $E_3(A_3)$, whence $M_3$ is
regular. Consequently, there is a regular maximal ideal $N_3$ which contains
$M_3$. We shall show that $N_3$ and $M_3$ are the same.

For this purpose, we define two mappings $G_i$ of $A_i$ into $E_3(A_3)$, $i = 1, 2$, as
follows:

$$G_i(a_i) = E_3(a_i u), \quad i = 1, 2,$$

where $a_i$ is in $A_i$ and $u$ is an identity modulo $M_3$. Clearly $G_i(a_i)$ is independent
of the choice of $u$. Let $F_3$ be some epimorphism, $F_3: A_3 \to A_3/N_3$, and let $H_i$,
$i = 1, 2$, be engendered by $F_3$ as $G_i$ are engendered by $E_3$. We will show
that $M_i = H_i^{-1}(0) = G_i^{-1}(0) = E_i^{-1}(0)$, ($i = 1, 2$).

If $a_1$ is in $M_1$, then $E_1(a_1) = 0$, whence $E_3(a_1 u) = E_3(a_1 u_1 \otimes u_2) =
E_1(a_1 u_1) \otimes E_2(u_2) = 0$ (where $u = u_1 \otimes u_2$). Thus $G_1(a_1) = 0$ and hence $M_1$
is contained in $G_1^{-1}(0)$. Since $G_1^{-1}(0)$ is a proper ideal and $M_1$ is a maximal
ideal we see that $M_1 = G_1^{-1}(0)$.

Since $a_1 u_1 \otimes u_2$ is a member of $M_3$ which is a subset of $N_3$, it follows that
$F_3(a_1 u_1 \otimes u_2) = 0 = H_1(a_1)$. We see that $M_1$ is contained in $H_1^{-1}(0)$ which
is a proper ideal of $A_1$. Since $M_1$ is maximal, it follows that $M_1 = H_1^{-1}(0)$. Of
course, by definition, $M_1 = E_1^{-1}(0)$. Analogously, we can show $M_2 = H_2^{-1}(0)
= G_2^{-1}(0) = E_2^{-1}(0)$.

In order to continue we shall require the following lemmas. 

LEMMA 1. Let $A$ be a Banach algebra, $I$ a closed ideal of $A$ and let $E$ and
$E''$ be two epimorphisms, $E: A \to A/I$, $E'': A \to A/I$. Then, relative to the
quotient space (norm) topology of $A/I$, there is an isometric automorphism $\alpha$
of $A/I$; $\alpha$ commutes with complex multiplication and $E = \alpha E''$.


Proof. For $b$ in $A/I$, let $E''(a) = b$ and let $\alpha(b) = E(a)$. If $E''(a') = b$,
then $a' - a$ is in $I$, whence $E(a') = E(a)$ and thus $\alpha(b)$ is uniquely defined.
If $E(a'') = b$, then $\alpha E''(a'') = E(a'') = b$, whence $\alpha$ is an automorphism,
which clearly commutes with complex multiplication. Finally, if $\|\cdots\|_A$ and
$\|\cdots\|$ are the respective norms of $A$ and $A/I$, we see

$$\|\alpha(b)\| = \|E(a)\| = \inf\{\|a + c\|_A \mid c \text{ in } I\}.$$

On the other hand,

$$\|b\| = \|E''(a)\| = \inf\{\|a + c\|_A \mid c \text{ in } I\},$$

whence $\|\alpha(b)\| = \|b\|$ and $\alpha$ is an isometry.


LEMMA 2. Let $A_i$ be Banach algebras, $I_i$ closed ideals in $A_i$, $E_i$, $E_i''$ epimorphisms,
$E_i: A_i \to A_i/I_i$, $E_i'': A_i \to A_i/I_i$, $\alpha_i$ the isometric automorphisms
(Lemma 1) for which $E_i = \alpha_i E_i''$, $i = 1, 2$, and let $\alpha$ be the tensor product
$\alpha_1 \otimes \alpha_2$. If $A_1 \otimes A_2$ is given some tensor product norm with respect to which
$A_1 \otimes A_2$ becomes a normed algebra, then $\alpha$ is an isometric automorphism of
$(A_1/I_1) \otimes (A_2/I_2)$ relative to the quotient space norm described earlier. If $E'$
and $(E')''$ are the respective epimorphisms engendered by $E_1$, $E_2$ and $E_1''$, $E_2''$,
then $E' = \alpha(E')''$.


Proof. The fact that $\alpha$ is an automorphism is clear, as is the relationship
$E' = \alpha(E')''$. If $b$ is in $(A_1/I_1) \otimes (A_2/I_2)$, then a representative of $b$ is an
expression of the form

$$\sum_{i=1}^{n} E_1''(a_1^{(i)}) \otimes E_2''(a_2^{(i)}).$$

A representative of $\alpha(b)$ is the expression

$$\sum_{i=1}^{n} E_1(a_1^{(i)}) \otimes E_2(a_2^{(i)}).$$

The argument given in Lemma 1 may be repeated mutatis mutandis to
show that $\alpha(b)$ and $b$ have the same norm.


LEMMA 3. Let $A$ be a normed algebra and let $\alpha$ be an isometric automorphism
of $A$ which commutes with complex multiplication. If the completion $\bar{A}$ of $A$ is
simple, so is the completion $\overline{\alpha A}$ of $\alpha A$.


Proof. Since $\alpha$ is an isometry, it may be extended in a unique fashion to
an isometric automorphism $\bar{\alpha}$ of $\bar{A}$ which commutes with complex
multiplication. Clearly $\bar{\alpha}(\bar{A}) = \overline{\alpha A}$, whence the simplicity of $\bar{A}$ implies the
simplicity of $\overline{\alpha A}$.

From the preceding paragraphs and lemmas we can conclude that there
exist isometric automorphisms $\alpha_i$, $\beta_i$ of $A_i/M_i$ which commute with complex
multiplication and which satisfy the relations $H_i = \alpha_i G_i = \beta_i E_i$, ($i = 1, 2$).
If $\beta$ is the tensor product $\beta_1 \otimes \beta_2$, then $\beta$ is an isometry of
$(A_1/M_1) \otimes (A_2/M_2)$ and the following relationship is valid:

$$F_3(A_3') = \beta((A_1/M_1) \otimes (A_2/M_2)) = \beta E_3(A_3').$$

Since the completion of $F_3(A_3')$ is $F_3(A_3)$ which is simple, and since $\beta^{-1}$ is
an isometry, we see (Lemma 3) that the completion of $\beta^{-1}F_3(A_3')$ is simple
and hence that the completion of $E_3(A_3')$ is simple. Hence $M_3 = E_3^{-1}(0)$ is
a regular maximal ideal, and thus $M_3 = N_3$.

We have shown how to associate with each pair $(M_1, M_2)$ a maximal ideal
$M_3$. The method of association demands that we show that $M_3$ is uniquely
determined in this manner by the pair $(M_1, M_2)$, regardless of which
epimorphisms $E_1$, $E_2$, etc., are used in the construction.

To this end, suppose that $E_1''$, $E_2''$ are chosen in place of $E_1$, $E_2$ at the
beginning of our construction. Then there are automorphisms $\gamma_i$ of $A_i/M_i$
such that $E_i'' = \gamma_i E_i$, ($i = 1, 2$). Let $E_3''$ be engendered by $E_1''$, $E_2''$ as $E_3$
is engendered by $E_1$, $E_2$, and let $E_3''$ engender $G_1''$, $G_2''$ as $E_3$ engenders $G_1$,
$G_2$. Then there are automorphisms $\pi_i$ of $A_i/M_i$ which satisfy $G_i'' = \pi_i E_i$,
$i = 1, 2$. If we set $\pi$ equal to the tensor product $\pi_1 \otimes \pi_2$, we see that $E_3'' = \pi E_3$
and hence that $(E_3'')^{-1}(0) = E_3^{-1}(0)$. Thus $M_3$ is uniquely determined, even
though the epimorphisms involved in its determination are not unique.

The interlude is over and we continue the proof by using the complete 
hypothesis of our theorem. 

We proceed to establish a correspondence between elements of $\mathfrak{M}_3$ and
elements of $\mathfrak{M}_1 \times \mathfrak{M}_2$. If $M_3$ is in $\mathfrak{M}_3$ and if $E_3$ is the (unique) epimorphism,
$E_3: A_3 \to A_3/M_3$, we can define $G_i$, $i = 1, 2$, as we did above in the more
general context. This time we define $M_i$ to be $G_i^{-1}(0)$, ($i = 1, 2$). We will
show that $M_1$ and $M_2$ are maximal ideals which engender, in the manner
described above, a maximal ideal which is precisely $M_3$. The circle will thereby
be closed.

The commutativity of $A_1$ and $A_2$ implies that $A_3'$ (and hence $A_3$) is
commutative. Let $u$ be an identity modulo $M_3$. Then if $p$ (in $A_3'$), represented by

$$\sum_{i=1}^{n} a_1^{(i)} \otimes a_2^{(i)},$$

is so near to $u$ that $E_3(p) \neq 0$, we see that some term in the representation
of $E_3(p)$ is not zero. Hence for some $i$,

$$G_1(a_1^{(i)}), \quad G_2(a_2^{(i)})$$

are both not zero. It follows, since $A_3/M_3$ is the complex number system,
that $G_i$ are non-trivial epimorphisms, $G_i: A_i \to C$ (the complex number
system), whence $M_i$ are maximal ideals, ($i = 1, 2$).

Clearly, if $M_3''$ is the maximal ideal engendered by the $M_i$ in the manner
described earlier, then $M_3''$ contains $M_3$, and, since $M_3$ is maximal, $M_3''$ and
$M_3$ are the same.

Thus we have established a 1-1 correspondence $t$ between $\mathfrak{M}_3$ and
$\mathfrak{M}_1 \times \mathfrak{M}_2$.
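
As a finite-dimensional sanity check of this correspondence (a toy illustration, not part of the argument above), one may take $A_1 = C(X)$ and $A_2 = C(Y)$ for finite sets $X$ and $Y$, so that maximal ideals are points and the tensor product is $C(X \times Y)$. The sketch below checks that evaluating $a_1 \otimes a_2$ at the point paired with $(x, y)$ gives the product of the two evaluations. The names `tensor` and `evaluate` and the sample data are assumptions made for illustration only.

```python
from itertools import product

def tensor(a1, a2):
    """Elementary tensor of two functions on finite sets, stored row-major."""
    return [u * v for u in a1 for v in a2]

def evaluate(p, x, y, size_y):
    """Evaluate an element of C(X x Y) at the point paired with (x, y)."""
    return p[x * size_y + y]

a1 = [2, -1, 3]        # a function on X = {0, 1, 2} (hypothetical data)
a2 = [5, 7]            # a function on Y = {0, 1} (hypothetical data)
p = tensor(a1, a2)

# Each pair (x, y) picks out exactly one coordinate of the tensor product.
indices = sorted(x * len(a2) + y
                 for x, y in product(range(len(a1)), range(len(a2))))
pairs_are_points = indices == list(range(len(a1) * len(a2)))

# Evaluation at (x, y) is multiplicative on elementary tensors.
multiplicative = all(
    evaluate(p, x, y, len(a2)) == a1[x] * a2[y]
    for x, y in product(range(len(a1)), range(len(a2)))
)
print(pairs_are_points, multiplicative)
```

In this toy setting the pairing of points is visibly one-to-one; the general statement requires the argument given in the text.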

The homeomorphism between $\mathfrak{M}_3$ and $\mathfrak{M}_1 \times \mathfrak{M}_2$ can be established as
follows. If $a$ is in a commutative Banach algebra $A$, $M$ is a maximal ideal
of $A$, then $a^{+}(M)$ denotes the complex number into which $a$ is mapped when
$A$ is reduced modulo $M$. If $M_{03}$ is in $\mathfrak{M}_3$, if $t(M_{03}) = (M_{01}, M_{02})$ and if
$N(M_{01}, M_{02})$ is a neighbourhood of $(M_{01}, M_{02})$ in $\mathfrak{M}_1 \times \mathfrak{M}_2$ we may assume
$N(M_{01}, M_{02})$ is of the form $N(M_{01}) \times N(M_{02})$ where $N(M_{0i})$ are
neighbourhoods of $M_{0i}$ in $\mathfrak{M}_i$ ($i = 1, 2$). But

$$N(M_{0i}) = \{M_i : |a_{ji}^{+}(M_i) - a_{ji}^{+}(M_{0i})| < r_i,\ j = 1, 2, \ldots, J_i;\ r_i > 0\}.$$

Consider

$$N(M_{03}) = \{M_3 : |(a_{j1} \otimes u_2)^{+}(M_3) - (a_{j1} \otimes u_2)^{+}(M_{03})| < r_1,\ j = 1, 2, \ldots, J_1;$$
$$|(u_1 \otimes a_{j2})^{+}(M_3) - (u_1 \otimes a_{j2})^{+}(M_{03})| < r_2,\ j = 1, 2, \ldots, J_2\},$$

where $u_i$ are identities modulo $M_i$, $i = 1, 2$, and $t(M_3) = (M_1, M_2)$. Since
$(u_1 \otimes a_2)^{+}(M_3) = a_2^{+}(M_2)$, we see $t(N(M_{03}))$ is contained in $N(M_{01}, M_{02})$.
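On the other hand, let

$$N(M_{03}) = \{M_3 : |a_j^{+}(M_3) - a_j^{+}(M_{03})| < r,\ j = 1, 2, \ldots, J\}.$$

Choose

$$P_j = \sum_{i=1}^{n_j} a_1^{(ij)} \otimes a_2^{(ij)} \text{ in } A_3'$$

so that $\|a_j - P_j\|_3 < r/3$, $j = 1, 2, \ldots, J$. Let

$$N(M_{01}) = \{M_1 : |a_1^{(ij)+}(M_1) - a_1^{(ij)+}(M_{01})| < r/(6Jn(2R_1 + 1)),\ i = 1, 2, \ldots, n_j;\ j = 1, 2, \ldots, J\},$$

where

$$n = \sum_{j=1}^{J} n_j, \quad R_1 = \sup_{i,j}\{\|a_1^{(ij)}\|_1\}.$$

Similarly let

$$N(M_{02}) = \{M_2 : |a_2^{(ij)+}(M_2) - a_2^{(ij)+}(M_{02})| < r/(6Jn(2R_2 + 1)),\ i = 1, 2, \ldots, n_j;\ j = 1, 2, \ldots, J\}.$$

Then if $t(M_3) = (M_1, M_2)$ is in $N(M_{01}) \times N(M_{02})$, we see

$$|a_j^{+}(M_3) - a_j^{+}(M_{03})| \le |(a_j - P_j)^{+}(M_3)| + |(a_j - P_j)^{+}(M_{03})| + |P_j^{+}(M_3) - P_j^{+}(M_{03})|$$
$$\le 2r/3 + \Big|\sum_{i=1}^{n_j} a_1^{(ij)+}(M_1)a_2^{(ij)+}(M_2) - a_1^{(ij)+}(M_{01})a_2^{(ij)+}(M_{02})\Big| < r,$$

and hence $t^{-1}(N(M_{01}) \times N(M_{02})) \subset N(M_{03})$, and $t$ is a homeomorphism. The
proof of Theorem 2 is complete.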


The following remarks are in order at this point.* 


1. A little reflection shows that $B^1(G, A)$ is the completion of the tensor
product of $L^1(G)$ and $A$ relative to the norm:

$$\Big\|\sum_{i=1}^{n} \lambda_i(x)a_i\Big\| = \int_G \Big\|\sum_{i=1}^{n} \lambda_i(x)a_i\Big\|_A\,dx.$$

2. The result of Hausner and the author shows that $C(X, A)$ is the
completion of the tensor product of $C(X)$ and $A$ relative to the norm:

$$\Big\|\sum_{i=1}^{n} \lambda_i(x)a_i\Big\| = \sup\Big\{\Big\|\sum_{i=1}^{n} \lambda_i(x)a_i\Big\|_A \,\Big|\, x \in X\Big\}.$$


*At the time of the writing of this paper, the author was unaware of the results of Willcox 
(8) and of the appearance of Hausner’s paper (3). Clearly the spirit expressed in the second 
paragraph, p. 876 of (8) has motivated much of our study. 












3. The tensorial approach explains and unifies a collection of phenomena
and symmetries hitherto observed without comprehension.

For example, Johnson (5) shows that $B^1(G, L^1(H))$ and $L^1(G \times H)$ are
isomorphic if $G$ and $H$ are locally compact abelian groups. From our standpoint
$B^1(G, L^1(H))$ is the completion of the tensor product $L^1(G) \otimes L^1(H)$. Relative
to this format, Johnson's theorem is essentially the statement that $L^1(G)
\otimes L^1(H)$ (completed) and $L^1(G \times H)$ are isomorphic. The symmetry and
truth of this statement are clarified by the tensorial viewpoint.


4. When either of $A_1$ or $A_2$ is non-commutative, the most important
topologies for the associated spaces of two-sided regular maximal ideals are
the kernel-hull topologies (7). In general, under these circumstances, $A_3$ will
be non-commutative and even if the "natural" 1-1 map $t: \mathfrak{M}_3 \to \mathfrak{M}_1 \times \mathfrak{M}_2$
can be constructed, the question of the bi-continuity of $t$ seems to be open.


5. The property §$ of || - - - ||; is irrelevant to the existence of the homeo- 
morphism ¢. The impact of Theorem | and the associated part of Theorem 2 
is the existence of norms for A;’ and A; relative to which they become normed 
or Banach algebras. 


6. When $A_1$ and $A_2$ are not assumed to be commutative, the following
results obtain:

(i) If $A_1$ and $A_2$ have "approximate identities," then the 1-1 mapping
$t: \mathfrak{M}_3 \to \mathfrak{M}_1 \times \mathfrak{M}_2$ can be constructed.

(ii) If $A_1$ and $A_2$ have identities $e_1$ and $e_2$, and if $t(M_3) = (M_1, M_2)$, then
$M_2 = M_3 \cap (e_1 \otimes A_2)$ and $M_1 = M_3 \cap (A_1 \otimes e_2)$.


Proof. Ad (i) By an "approximate identity" in $A_i$ is meant an $A_i$-valued
function $v_{ip}$ on a directed set $P_i$ such that

$$\lim_{p} v_{ip}a_i = a_i$$

for any $a_i$ in $A_i$, ($i = 1, 2$). We have observed that $\mathfrak{M}_1 \times \mathfrak{M}_2$ is always
naturally embedded in $\mathfrak{M}_3$. On the other hand, for a given $M_3$ in $\mathfrak{M}_3$ the
construction of the naturally associated pair $(M_1, M_2)$ can begin with the
mappings $G_1$, $G_2$ as above. This time, however, the proof of the regularity
and maximality of the relevant ideals proceeds differently.

First, recognizing that $A_3$ is an $A_i$-module, we remark that

$$\lim_{p} v_{ip}q = q, \quad i = 1, 2,$$

for any $q$ in $A_3$. Thus, if $u$ is an identity modulo $M_3$,

$$\lim_{p} E_3(v_{ip}u) = E_3(u) = e,$$

the identity of $A_3/M_3$. This means that $G_i(A_i)$ has $e$ as a point of closure.
On the other hand, $G_i(A_i)$ is a complete normed space, and thus $e$ is in $G_i(A_i)$,
whence $G_i^{-1}(0) = M_i$ is regular, $i = 1, 2$. The maximality of $M_i$ can be
established as in the previous case, once the regularity is known. Of course,
our observations on the ambiguity of the epimorphisms can be repeated.

Ad (ii) The existence of $t$ is assured by (i). $e_1 \otimes A_2$ and $A_2$ are isomorphic.
Clearly $M_3 \cap (e_1 \otimes A_2)$ is isomorphic to some ideal $N_2$ in $A_2$. Let $E_3$, $G_2$
have meanings as given earlier. Then, since $e_1 \otimes e_2$ is an identity modulo $M_3$,

$$G_2(a_2) = E_3((e_1 \otimes e_2)a_2) = E_3(e_1 \otimes a_2)$$

and $G_2(a_2) = 0$ if and only if $E_3(e_1 \otimes a_2) = 0$, that is, if and only if $e_1 \otimes a_2$
is in $M_3$. Since $e_1 \otimes a_2$ is in $e_1 \otimes A_2$ we see $G_2(a_2) = 0$ if and only if $e_1 \otimes a_2$
is in $M_3 \cap (e_1 \otimes A_2) = N_2$. Thus $N_2 = M_2 = G_2^{-1}(0)$.


3. Group Representations. In the particular case where $A_1 = L^1(G)$,
$G$ is a locally compact abelian group, and $A_2$ is a commutative Banach algebra
with an involution and an identity, there are some interesting group
representations which can be found.

If $\alpha(x)$ is in $G^{+}$ (the character group of $G$), then for $f(x)$ in $A_3 = B^1(G, A_2)$,
the mapping $\pi_\alpha: A_3 \to A_2$ defined by

$$\pi_\alpha(f(x)) = \int_G f(x)\alpha(x)\,dx$$

is a homomorphism. If we define an "involution" in $A_3$ by the formula
$f^{*}(x) = (f(x^{-1}))^{*}$ where $*$ is the involution in $A_2$, then $\pi_\alpha(f^{*}(x)) = (\pi_\alpha(f(x)))^{*}$.
Clearly $\pi_\alpha$ is continuous, and actually $\pi_\alpha$ is an epimorphism (which
commutes with multiplication by elements of $A_2$), since $\pi_\alpha(\lambda(x)e_2) = e_2$, if $\lambda(x)$
is in $L^1(G)$ and $\lambda^{+}(\alpha) = 1$.

On the other hand, let $\mathfrak{S}$ be the non-empty set of inverses in $A_2$ and let
$\pi$ be a $*A_2$-epimorphism: $\pi: A_3 \to A_2$, that is, $\pi$ commutes with
multiplication by elements of $A_2$ and $\pi(f^{*}) = (\pi f)^{*}$. For arbitrary $f$ in $\pi^{-1}(\mathfrak{S})$ define
$\alpha_\pi(x)$ by the formula $(\pi(f_x))(\pi f)^{-1}$. Then, in the usual fashion, one can show:
$\alpha_\pi(xy) = \alpha_\pi(x)\alpha_\pi(y)$; $\alpha_\pi(x^{-1}) = (\alpha_\pi(x))^{*}$; $\alpha_\pi(e) = e_2$; $\alpha_\pi(x)$ is $f$-free, bounded,
continuous; $\pi(u_x) \to \alpha_\pi(x)$ for any approximate identity $\{u\}$ in $L^1(G)$, where
$e$ is the identity of $G$, $e_2$ is the identity of $A_2$. We call $\alpha_\pi(x)$ a unitary
representation of $G$ into $\mathfrak{S}$. The direct computation which follows shows that

$$\pi(f(x)) = \int_G f(x)(\alpha_\pi(x))^{*}\,dx.$$

If we let $g$ be in $\pi^{-1}(\mathfrak{S})$, then

$$\int_G f(x)(\alpha_\pi(x))^{*}\,dx = \Big(\int_G f(x)\pi(g_{x^{-1}})\,dx\Big)(\pi(g))^{-1} = \pi(f * g)(\pi(g))^{-1} = \pi(f).$$

Hence there is a 1-1 correspondence between $*A_2$-epimorphisms $\pi$ of $A_3$ onto
$A_2$ and unitary representations $\alpha_\pi$ of $G$ into $\mathfrak{S}$.

Let $\mathfrak{G}$ denote the group of all such unitary representations $\alpha_\pi(x)$. The
compact-open and weak* topologies (for mappings of $A_3$ into $A_2$) are identical
for $\mathfrak{G}$. In general, $\mathfrak{G}$ is not locally compact.












The proofs of the last two statements are straightforward and are therefore 
omitted. 

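If $M_2$ in $\mathfrak{M}_2$ is fixed, then $(\alpha_\pi(x))^{+}(M_2)$, as a function on $G$, is a member
of $G^{+}$. Hence, for each $M_2$ in $\mathfrak{M}_2$, there is an epimorphism

$$E_{M_2}: \mathfrak{G} \to G^{+},$$

given by:

$$E_{M_2}(\alpha_\pi(x)) = (\alpha_\pi(x))^{+}(M_2).$$

If $x$ is fixed, then $(\alpha_\pi(x))^{+}(M_2)$ is a $G^{+}$-valued function on $\mathfrak{M}_2$, and actually
$(\alpha_\pi(x))^{+}(M_2)$ is in $C(\mathfrak{M}_2, G^{+})$.

If $A_2$ and $A_2^{+}$ are isomorphic, if $A_2^{+} = C(\mathfrak{M}_2)$, that is, if $A_2$ and $C(\mathfrak{M}_2)$
are equivalent, and if $\beta$ is in $C(\mathfrak{M}_2, G^{+})$, define $\pi_\beta$ in $A_3$ by:

$$\pi_\beta(f(x)) = \Big(\int_G f(x)\beta(x; M_2)\,dx\Big)(M_2).$$

Then

$$(\alpha_{\pi_\beta}(x))^{+}(M_2) = \beta(x; M_2).$$

We have thus far shown that there is a natural mapping $\gamma$ of $\mathfrak{G}$ into $C(\mathfrak{M}_2, G^{+})$
given by

$$\gamma(\alpha_\pi(x)) = \alpha_\pi(x)^{+}(M_2)$$

and that if $A_2$ and $C(\mathfrak{M}_2)$ are equivalent, then the natural mapping $\gamma$ carries
$\mathfrak{G}$ onto $C(\mathfrak{M}_2, G^{+})$.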

Before stating the next theorem we shall require the following discussion. 

If $\alpha(x)$ is in $G^{+}$, then for each $x$ there is a unique real number $\beta(x)$,
$0 \le \beta(x) < 2\pi$, such that $\alpha(x) = \exp(i\beta(x))$. For example, if $G$ is the circle
group (the reals reduced modulo $2\pi$), and if $\alpha(x)$ is in $G^{+}$, then there is an
integer $n$ such that $\alpha(x) = \exp(i\{nx\})$ where $\{nx\}$ is the residue of $nx$ modulo
$2\pi$. Although $\exp(i\{x\})$ is a character in this case, $\exp(i\{\tfrac{1}{2}\{x\}\})$ is not (since,
for example, $\exp(i\{\tfrac{1}{2}\{2\pi - y\}\}) = \exp(i(\tfrac{1}{2})(2\pi - y)) \to \exp(i\pi) = -1$ as
$y \downarrow 0$, whereas $\exp(i\{\tfrac{1}{2}\{2\pi - y\}\})$ should approach 1 as $y \downarrow 0$ if
$\exp(i\{\tfrac{1}{2}\{2\pi - y\}\})$ is a character). Hence, in general, even if $\exp(i\beta(x))$ is a character,
$\exp(i\{s\beta(x)\})$ is not necessarily a character for all real $s$.
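
The failure described for the circle group can be checked numerically. The following is a minimal sketch, assuming the residue $\{x\}$ is taken in $[0, 2\pi)$; the function names `residue`, `chi` and `is_multiplicative` are hypothetical and chosen only for this illustration.

```python
import cmath
import math

TWO_PI = 2 * math.pi

def residue(x):
    """Residue of x modulo 2*pi, taken in [0, 2*pi)."""
    return x % TWO_PI

def chi(x, s=1.0):
    """The function exp(i * {s * {x}}) on the circle group."""
    return cmath.exp(1j * residue(s * residue(x)))

def is_multiplicative(s, x, y, tol=1e-9):
    """Check chi(x + y) == chi(x) * chi(y) for the given s, up to rounding."""
    return abs(chi(x + y, s) - chi(x, s) * chi(y, s)) < tol

x = y = 1.5 * math.pi
print(is_multiplicative(1.0, x, y))   # s = 1: exp(i{x}) is a character
print(is_multiplicative(0.5, x, y))   # s = 1/2: multiplicativity fails here
```

A single failing pair $(x, y)$ is enough to show that $\exp(i\{\tfrac{1}{2}\{x\}\})$ is not a homomorphism, which is the algebraic counterpart of the discontinuity argument in the text.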

On the other hand, if $G$ is the additive group of real numbers, and $\alpha(x)$ is
in $G^{+}$, then there is a real number $t$ such that $\alpha(x) = \exp(i\{tx\})$, $\{tx\}$ the
residue of $tx$ modulo $2\pi$. In this case, for any real $s$, $\exp(i\{s\{tx\}\})$ is again
a character.

If a group $G$ has the property that $\exp(i\beta(x))$ is a character implies
$\exp(i\{s\beta(x)\})$ is a character, for all real $s$, we shall call $G$ real-closed.


THEOREM 1. If $A_2$ is semisimple $\gamma$ is 1-1; the converse is false. If $A_2$ and
$C(\mathfrak{M}_2)$ are equivalent and $G$ is real-closed, then $\gamma: \mathfrak{G} \to C(\mathfrak{M}_2, G^{+})$ is an
isomorphism and conversely, if $G$ is real-closed and $\gamma$ is an isomorphism, then
$A_2^{+} = C(\mathfrak{M}_2)$.






















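Proof. Assume $A_2$ is semisimple and assume

$$\gamma(\alpha_{\pi_1}) = \gamma(\alpha_{\pi_2}).$$

If

$$\alpha_{\pi_1} \neq \alpha_{\pi_2}, \text{ then } \pi_1 \neq \pi_2$$

and there is an $f$ in $A_3$ such that $\pi_1(f) = a_1 \neq a_2 = \pi_2(f)$. But

$$(a_1 - a_2)^{+}(M_2) = \Big(\int_G f(x)(\alpha_{\pi_1}(x) - \alpha_{\pi_2}(x))^{*}\,dx\Big)^{+}(M_2) = 0$$

for all $M_2$, a contradiction of the semisimplicity of $A_2$.

Assume $A_2$ has a radical, $R_2$. Then $R_2$ is a non-trivial group $R_2^{\circ}$ relative
to the multiplication: $r_1 \circ r_2 = r_1 + r_2 - r_1 r_2$.
Now

$$\gamma(\alpha_{\pi_1}) = \gamma(\alpha_{\pi_2}) \text{ if, and only if, } \alpha_{\pi_1}(x) - \alpha_{\pi_2}(x)$$

is in $R_2$ for all $x$. But

$$\alpha_{\pi_1}(x) - \alpha_{\pi_2}(x) \in R_2 \text{ for all } x \text{ if, and only if, } 1 - \alpha_{\pi_1}(x)^{*}\alpha_{\pi_2}(x) = r(x) \in R_2$$

for all $x$. Clearly $r(x)$ is a representation of $G$ into $R_2^{\circ}$ ($R_2$ as a group re $\circ$).
Hence $\gamma$ is not 1-1 if, and only if, there is a non-trivial representation $r(x)$
of $G$ into $R_2^{\circ}$.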

However, $R_2^{\circ}$ contains no elements (different from 0) of finite order. For

$$\underbrace{(r) \circ (r) \circ \cdots \circ (r)}_{n} = r^{(n)} = 1 - (1 - r)^{n}.$$





If r° = 0, rin R,’, r 4 0, then (— 1)**'r” = Py where Py is a polynomial 
of degree N—n in n, with coefficients which are polynomials of degree not 


more than m—1 in r. Thus ||r*|| > 2*-"|!Q,||, where 


Q. = > (—1)"*),Cyr"*. 


Hence ||r*|/0/ > n@-")/||9,||C/) +n, as N— ©, a contradiction of the 
fact that r is in Ro. 
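
The identity $r \circ r \circ \cdots \circ r = 1 - (1 - r)^n$ for the circle operation can be checked on a small example. The sketch below is only an illustration under a stated assumption: a strictly upper triangular (hence nilpotent) integer matrix stands in for a radical element, and the helper names are hypothetical. It does not reproduce the norm estimate of the text.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def circle(r, s):
    """The circle operation r o s = r + s - r s."""
    return sub(add(r, s), mul(r, s))

def circle_power(r, n):
    """n-fold circle product r o r o ... o r (the zero matrix for n = 0)."""
    acc = [[0] * len(r) for _ in r]
    for _ in range(n):
        acc = circle(acc, r)
    return acc

def closed_form(r, n):
    """1 - (1 - r)^n, with 1 the identity matrix."""
    one = identity(len(r))
    power = one
    for _ in range(n):
        power = mul(power, sub(one, r))
    return sub(one, power)

# A nilpotent integer matrix, standing in for a radical element (assumption).
r = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]
zero = [[0] * 3 for _ in range(3)]
agree = all(circle_power(r, n) == closed_form(r, n) for n in range(1, 8))
never_zero = all(circle_power(r, n) != zero for n in range(1, 8))
print(agree, never_zero)
```

For this particular $r$ the entry in the first row and second column of the $n$-fold circle product equals $n$, so no circle power with $n \ge 1$ vanishes, in line with the claim that non-zero elements have infinite order.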

Thus if $G$ has only elements of finite order, $r(x)$ cannot be non-trivial. On
the other hand, if $G = R_2^{\circ}$ (with discrete topology), then $r(x) = x$ serves.
Hence the monomorphy of $\gamma$ depends both on the presence or absence of the
radical in $A_2$ and on the nature of $G$.

If $A_2$ and $C(\mathfrak{M}_2)$ are equivalent, then, of course, $A_2$ is semisimple and
hence $\gamma$ is 1-1, and, as we have shown, an epimorphism. Thus $\gamma$ is an
isomorphism.

On the other hand, if $\gamma$ is an isomorphism and if $A_2^{+}$ is a proper subset
of $C(\mathfrak{M}_2)$ let $z(M_2)$ be in the complement of $A_2^{+}$. Let $z = u + iv$. Then one
of $u$, $v$ is not in $A_2^{+}$, whence we may assume $z$ is real-valued. Since 1 is in
$A_2^{+}$, $A_2^{+}$ contains all constants and hence for some constant $c$, $z(M_2) + c > 0$,
all $M_2$. Hence we assume for any $w$, $0 < w < 1$, there is a $z(M_2)$ in $C(\mathfrak{M}_2) - A_2^{+}$
such that

$$w = \inf\{z(M_2) \mid M_2 \text{ in } \mathfrak{M}_2\} < \sup\{z(M_2) \mid M_2 \text{ in } \mathfrak{M}_2\} = 1.$$

Choose $g(x)$ in $L^1(G)$ so that: $g = 0$ outside some compact neighbourhood $N$
of the identity $e$ in $G$; $g(x) \ge 0$; $g^{+}(\alpha) \ge 0$; $\|g\|_1 = 1$; $g^{+}(\alpha)$ takes on at least
three values. Clearly $1 = g^{+}(e^{+}) \le \|g^{+}\|_\infty \le \|g\|_1 = 1$. Let $g^{+}(\alpha_0) = w \neq 0, 1$.
Then $0 < w < 1$ and we now assume $z(M_2)$ and $w$ are related as indicated
earlier. If $\alpha_0(x) = \exp(i\beta_0(x))$, let

$$h(s) = \int_N g(x)\exp(i\{s\beta_0(x)\})\,dx = \int_N g(x)\exp(is\beta_0(x))\,dx.$$

Then we see that for real $s$:

(a) $h(s)$ is in $C^{\infty}(-\infty, \infty)$;

(b) $h(s)$ is real;

(c) $|h^{(n)}(s)| \le K^{n}$ where $K = \sup\{|\beta_0(x)| \mid x \text{ in } N\}$.

Hence $h(s)$ is entire. Since $h(0) = 1$, $h(1) = w < 1$, we see $h(s)$ is not
constant. Hence there is an interval $(s', s'')$, $0 < s' < s'' < 1$ where $h'(s) < 0$,
and on $(s', s'')$ $h(s)$ has a continuous real-valued inverse: $s = h^{-1}(t)$,
where $h(s'') = t'' < t < t' = h(s')$. Let $y(M_2) = az(M_2) + b$ be such that
$t'' < y(M_2) < t'$. Then $y(M_2)$ is not in $A_2^{+}$ and if $k(M_2) = h^{-1}(y(M_2))$, then
$k(M_2)$ is in $C(\mathfrak{M}_2)$, and $k(M_2)$ is real-valued. For $f(x) = g(x)e_2$ ($e_2$ the identity
of $A_2$) consider

$$\int_G f(x)\exp(ik(M_2)\beta_0(x))\,dx = \int_G f(x)\exp(i\{k(M_2)\beta_0(x)\})\,dx = h(k(M_2)) = y(M_2).$$

Clearly $\exp(i\{k(M_2)\beta_0(x)\})$ is in $C(\mathfrak{M}_2, G^{+})$ and by hypothesis ($G$ is
real-closed) there is an $\alpha_\pi$ in $\mathfrak{G}$ such that $\gamma(\alpha_\pi) = \exp(i\{k(M_2)\beta_0(x)\})$. But then
$(\pi(f))^{+}(M_2) = y(M_2)$, contradicting the fact that $y(M_2)$ is not in $A_2^{+}$. The
proof of the theorem is complete.


4. Miscellany. If $G$ is locally compact abelian, $A_1 = L^1(G)$, and if $A_2$
has an involution ($A_2$ is assumed to have no identity and is not assumed to
be commutative) easily verified extensions of the above read as follows:


1. Suppose $A_2$ is extended to $A_{2e}$ by the adjunction of an identity. Let $A_{3e}$ be
the completion of the tensor product of $A_1$ and $A_{2e}$. Call an epimorphism $\pi: A_3 \to A_2$
extendable if there is an epimorphism $\pi_e: A_{3e} \to A_{2e}$ which coincides with $\pi$ on
$A_3$ (naturally embedded in $A_{3e}$). Then the extendable epimorphisms $\pi: A_3 \to A_2$
are in 1-1 correspondence with the unitary representations of $G$ into the
multiplicative group of $A_{2e}$.


2. If $v(x)$ is a continuous homomorphism

$$v: G \to G_2^{\circ},$$

(the multiplicative group of $A_2$ relative to $\circ$), then the formula:

$$\pi(f) = \int_G f(x)\,dx - \int_G f(x)(v(x))^{*}\,dx$$

defines an extendable epimorphism $\pi: A_3 \to A_2$. The extension of $\pi$ is given by
the formula:

$$\pi_e(\lambda(x)e + f(x)) = \int_G (\lambda(x)e + f(x))(e - (v(x))^{*})\,dx.$$

More generally, if $a(x)$ is a numerical function, $a_2(x)$ an $A_2$-valued function
such that $(1 - a(x))(e - a_2(x)) = u(x)$ is a unitary representation of $G$ into
the multiplicative group of $A_{2e}$, then the formula

$$\pi(f) = \int_G f(x)(1 - a(x))\,dx - \int_G f(x)(a_2(x))^{*}\,dx$$

defines an extendable epimorphism $\pi: A_3 \to A_2$. The extension $\pi_e$ is given by
the formula

$$\pi_e(\lambda(x)e + f(x)) = \int_G (\lambda(x)e + f(x))(u(x))^{*}\,dx.$$

Conversely, an extendable epimorphism $\pi: A_3 \to A_2$ serves to define two
functions $a(x)$, $a_2(x)$ such that $(1 - a(x))(e - a_2(x)) = u(x)$ is a unitary
representation of $G$ into the multiplicative group of $A_{2e}$.


The last result stems from defining 7, on A>», as follows: If ao, is in Ax, 
and d2, = m,(A(x)e+f(x)) then let T7,(a2,) = 2,(Ae +f) — 2, (Are + f,). 
This definition of 7, is (Ae + f)-free. T, satisfies the classic criterion: 


T,(ab) = (T,a)b 


for membership in A», considered as.a subalgebra of the ring E(A,) of endo- 
morphisms of A,. Hence 7, = a(x)e + a2(x) and the verification of the 
result follows immediately. 


Remark. The criterion mentioned above is not valid for algebras having no
identity. For example, if $A_2 = L^1(-\infty, \infty)$, $Tf = f_x$, then $T(f * g) = (Tf) * g$.
But there is no $h$ in $A_2$ such that $Tf = h * f$, as is well known.

The standard techniques also show that a(x)e + a2(x) = lim ,#,(u,e) for 
any approximate identity {u} in A. 












REFERENCES 


0. N. Bourbaki, Éléments de mathématique, VII, Première partie, Les Structures fondamentales
de l'analyse, livre II, Algèbre, chapitre III, Algèbre multilinéaire, Act. Sci. et Indust.,
1044 (Paris, 1948), 30-8.

1. A. Hausner, Abstract 493, Bull. Amer. Math. Soc. (July, 1956), 383.

2. Proc. Amer. Math. Soc., 8 (1957), 246-9.

3. The Tauberian theorem for group algebras of vector-valued functions, Pacific J. Math.,
7 (1957), 1603-10.

4. G. P. Johnson, Abstract 458, Bull. Amer. Math. Soc. (July, 1956), 366.

5. To appear in Trans. Amer. Math. Soc.

6. R. Schatten, A theory of cross-spaces (Princeton, 1950).

7. I. Segal, The group algebra of a locally compact group, Trans. Amer. Math. Soc., 61 (1947),
69-105.

8. A. B. Willcox, Note on certain group algebras, Proc. Amer. Math. Soc., 7 (1956), 874-9.


University of Minnesota 

















A CLASS OF SOLVABLE GROUPS 
DANIEL GORENSTEIN AND I. N. HERSTEIN


1. Introduction. Numerous studies have been made of groups, especially 
of finite groups, G which have a representation in the form AB, where A and 
B are subgroups of G. The form of these results is to determine various group- 
theoretic properties of G, for example, solvability, from other group-theoretic 
properties of the subgroups A and B. 

More recently the structure of finite groups G which have a representation 
in the form ABA, where A and B are subgroups of G, has been investigated. In 
an unpublished paper, Herstein and Kaplansky (2) have shown that if A and 
B are both cyclic, and at least one of them is of prime order, then G is solvable. 
Also Gorenstein (1) has completely characterized ABA groups in which every 
element is either in A or has a unique representation in the form aba’, where 
$a, a'$ are in $A$, and $b \neq 1$ is in $B$.

In this paper we shall analyse groups of the form ABA in which A and B 
are cyclic of relatively prime order. The techniques and methods used borrow 
heavily from those used in the aforementioned paper of Herstein and Kap- 
lansky. The authors became interested in the structure of ABA groups as an 
outgrowth of problems they considered while at a conference held at Bowdoin 
College in the summer of 1957 under the auspices of the Cambridge Research 
Center of the United States Air Force. 

In the body of the paper we shall use the following notation: If H is a 
subgroup of G, o(H), i(H), and N(H) will denote respectively the order of H, 
the index of H in G, and the normalizer of H in G. 


2. Two preliminary lemmas. We shall need a result on the transfer
homomorphism which is a slight extension of a result of Grün (3, p. 143); in fact,
the result is essentially contained in Grün's, but for the sake of completeness
we present it here.


LEMMA 1. Let $G$ be a finite group, and $A$ an Abelian subgroup of $G$ for which
$(o(A), i(A)) = 1$. Then the transfer of $G$ into $A$ maps the intersection of $A$ with
the centre of its normalizer onto itself.


Proof. Since $(o(A), i(A)) = 1$ it is clear that for any $p \mid o(A)$ the $p$-Sylow
subgroup of $A$ is a $p$-Sylow subgroup of $G$.

We first contend that if $A_1$ is an Abelian subgroup of $G$ and $o(A_1) = o(A)$,
then $A_1$ is a conjugate of $A$. Let $S_p \neq (1)$ be the $p$-Sylow subgroup of $A$, and
$S_p'$ that of $A_1$.


Received June 16, 1958. The work of the second author was supported in part by OOR, 
Contract No. ORDOR-LO-P-2042/A11472. 










Since these are $p$-Sylow subgroups of $G$, there is a $y \in G$ such that $S_p = yS_p'y^{-1}$.
If we replace $A_1$ by $yA_1y^{-1}$ we may, without loss of generality, assume that
$S_p \subset A$ and $S_p \subset A_1$.

If $N(S_p) \neq G$, our contention follows by induction and from the fact that
both $A$ and $A_1$, being Abelian, are contained in $N(S_p)$. If, on the other hand,
$N(S_p) = G$, we use induction on $\bar{G} = G/S_p$ to conclude that $\bar{A}$, $\bar{A}_1$, the images
of $A$ and $A_1$ in $\bar{G}$, are conjugate in $\bar{G}$. Since both $A$ and $A_1$ contain $S_p$, their
conjugacy in $G$ follows at once.

From this, and the usual argument made on the centres of Sylow subgroups,
we can say that if two elements of $A$ are conjugate in $G$ then they are
already conjugate in $N(A)$.

We are now able to prove the lemma. For let $a_1$ be an element in the
intersection of $A$ with the centre of $N(A)$; we compute the transfer, $\tau$, on $a_1$. Since
$A$ is Abelian,

$$\tau(a_1) = \prod_{i=1}^{r} x_i a_1^{f_i} x_i^{-1} \quad \text{where} \quad \sum_{i=1}^{r} f_i = i(A) \text{ and } x_i a_1^{f_i} x_i^{-1} \in A.$$

However, since

$$a_1^{f_i} \text{ and } x_i a_1^{f_i} x_i^{-1}$$

are conjugate in $G$ and are in $A$ they are conjugate in $N(A)$; since $a_1$ is in the
centre of $N(A)$ they must be equal. Thus

$$\tau(a_1) = \prod_{i=1}^{r} a_1^{f_i} = a_1^{i(A)};$$

and since $(o(A), i(A)) = 1$, the lemma follows.
The second result we shall need is contained in the following lemma. 


LEMMA 2. Suppose a finite group $G$ admits an automorphism $\alpha$ of order $h$
such that every element of $G$ can be expressed in the form $\alpha^{i}(b^{j})$ for some fixed
element $b$ of $G$ of order $k$. If $(h, k) = 1$, then $G$ is either Abelian or is the direct
product of an Abelian group of odd order with the quaternion group of order 8.
If $\alpha$ leaves only the identity element of $G$ fixed, then $G$ is Abelian.
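
The quaternion group itself satisfies the hypothesis of the lemma, which shows that the non-Abelian alternative can occur. The sketch below is an illustration only; the encoding of quaternions as integer 4-tuples and the function names are assumptions made here. It takes $b = i$ (so $k = 4$) and the automorphism of order $h = 3$ that cycles $i \to j \to k \to i$, and checks that the elements $\alpha^{e}(b^{j})$ exhaust a non-Abelian group of order 8.

```python
from itertools import product

def qmul(p, q):
    """Hamilton product of quaternions given as integer 4-tuples (1, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def alpha(q):
    """Automorphism of order 3 sending i -> j -> k -> i."""
    a, b, c, d = q
    return (a, d, b, c)

ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)          # the fixed element b, of order k = 4
J = (0, 0, 1, 0)

def reachable(b, k, h):
    """All elements alpha^e(b^j) with 0 <= e < h and 0 <= j < k."""
    found = set()
    power = ONE
    for _j in range(k):
        image = power
        for _e in range(h):
            found.add(image)
            image = alpha(image)
        power = qmul(power, b)
    return found

G = reachable(I, 4, 3)
closed = all(qmul(p, q) in G for p, q in product(G, G))
respects_products = all(alpha(qmul(p, q)) == qmul(alpha(p), alpha(q))
                        for p, q in product(G, G))
print(len(G), closed, respects_products, qmul(I, J) != qmul(J, I))
```

Here $\alpha$ fixes $-1$, an element different from the identity, so this example is consistent with the final sentence of the lemma.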


Proof. We proceed by induction on the order of G. 

Suppose, first, that $\alpha$ leaves some element, $\neq 1$, of $G$ fixed. Then for some
$e$, $t$, $\alpha(\alpha^{e}(b^{t})) = \alpha^{e}(b^{t})$, $b^{t} \neq 1$. Thus $\alpha(b^{t}) = b^{t}$. But then for all $i$, $j$

$$\alpha^{i}(b^{j}) \cdot b^{t} = \alpha^{i}(b^{j})\alpha^{i}(b^{t}) = \alpha^{i}(b^{j+t}) = b^{t}\alpha^{i}(b^{j}),$$

and so $b^{t}$ is in $Z$, the centre of $G$.

Since the order of every element of $G$ is a divisor of the order of $b$, then
for any prime $p$, $p \mid o(G)$ implies $p \mid k$. We consider the cases when $p \nmid k/t$ and
$p \mid k/t$ separately.

Suppose first that $p \nmid k/t$. Let $\bar{G} = G/(b^{t})$ and let $\bar{\alpha}$ be the automorphism
induced by $\alpha$ on $\bar{G}$. If $\bar{b}$ is the image of $b$ in $\bar{G}$, then every element of $\bar{G}$ is
clearly of the form $\bar{\alpha}^{i}(\bar{b}^{j})$. By our induction hypothesis the $p$-Sylow subgroup
$\bar{S}_p$ of $\bar{G}$ is normal in $\bar{G}$. Thus the inverse image of $\bar{S}_p$ in $G$ is normal in $G$ and is
of the form $S_p \cdot (b^{t})$ for some $p$-Sylow subgroup $S_p$ of $G$. If $s \in S_p$, $x \in G$, then
$xsx^{-1} = b^{tu}s_1$, $s_1 \in S_p$; since the order $k/t$ of $b^{t}$ is relatively prime to $p$, this
implies $b^{tu} = 1$, and so $xS_px^{-1} = S_p$, so $S_p$ is normal in $G$.

Furthermore

$$\bar{S}_p = \frac{S_p(b^{t})}{(b^{t})} \cong \frac{S_p}{S_p \cap (b^{t})} = S_p;$$

by induction $\bar{S}_p$, and hence $S_p$, is either Abelian or isomorphic to the quaternion
group of order 8.

If, on the other hand, $p \mid k/t$ then $b^{k/p}$ is in $Z$, and being of prime-power
order, must be in all $p$-Sylow subgroups of $G$. By induction, $\bar{S}_p$, the $p$-Sylow
subgroup of $\bar{G} = G/(b^{k/p})$ is normal in $\bar{G}$, so its inverse image, $S_p$, must be
normal in $G$. Thus $S_p$ contains all elements of $G$ whose order is a power of
$p$. We claim $S_p$ contains a unique subgroup of order $p$. For if $\alpha^{i}(b^{j})$ is of order
$p$, then $j$ is a multiple of $k/p$, so $\alpha^{i}(b^{j}) = b^{j}$ since $\alpha(b^{k/p}) = b^{k/p}$; thus the only
subgroup of order $p$ is $(b^{k/p})$. It is well known that a group of prime-power
order having only one subgroup of order $p$ is cyclic if $p$ is odd and is either
cyclic or a generalized quaternion group of order $2^{n}$ if $p = 2$. Now $S_2$ is a
normal subgroup of $G$ invariant under $\alpha$. Since by assumption $(h, k) = 1$ and
since $2 \mid k$ if $S_2 \neq 1$, we conclude that $\alpha$ is of odd order. If $\alpha$ reduces to the
identity on $S_2$, $S_2$ is cyclic. Hence if $S_2$ is isomorphic to the generalized
quaternion group, $\alpha$ has odd order on $S_2$. But for $n > 3$ the automorphism group of
a generalized quaternion group of order $2^{n}$ is of order $2^{2n-3}$. Thus the only
possibility in our case is $n = 3$, and $S_2$ is isomorphic to the quaternion group
of order 8.

There remains the case when $\alpha$ leaves no element of $G$, other than 1, fixed.
In this situation it is known that for each prime $p$, $\alpha$ must leave some $p$-Sylow
subgroup, say, $S_p$, fixed. Since an element of $S_p$ is of the form $\alpha^{i}(b^{j})$, it follows
readily that $S_p$ consists of all the elements of $G$ whose order is a power of $p$.
$S_p$ is then the unique $p$-Sylow subgroup of $G$ and so is normal in $G$, and $G$ is
the direct product of its Sylow subgroups. We still must show that $S_p$ is
either Abelian or the quaternion group of order 8. Thus we may, without
loss of generality, assume that $S_p = G$.

Suppose then that $k = p^{s}$. We compute the number of elements in $G$.
Let $r_i$ be the least positive integer such that

$$\alpha^{r_i}(b^{p^{i}}) \in (b^{p^{i}}).$$

It is clear that the number of elements in $G$, of order exactly $p^{s-i}$, is
$r_i(p^{s-i} - p^{s-i-1})$, $i < s$, and so

$$o(G) = p^{n} = r_0(p^{s} - p^{s-1}) + r_1(p^{s-1} - p^{s-2}) + \ldots + r_{s-1}(p - 1) + 1. \tag{2.1}$$

However, the elements of order $p$ in the centre of $G$ form a characteristic
subgroup of $G$, and so the elements of the form

$$\alpha^{i}(b^{p^{s-1}})$$

(that is, all the elements of order $p$), form a subgroup of $G$ (in $Z$) of order
$p^{m}$, containing $r_{s-1}(p - 1) + 1$ elements. So

$$p^{m} = r_{s-1}(p - 1) + 1. \tag{2.2}$$

Combining (2.1) and (2.2) we have

$$p^{n} - p^{m} = r_0(p^{s} - p^{s-1}) + \ldots + r_{s-2}(p^{2} - p).$$

If $m > 1$, then $p^{2}$ divides the left-hand side, and so must divide the
right-hand side; but then $p \mid r_{s-2}$. Since $r_{s-2} \mid h$, and $1 = (h, k) = (h, p^{s})$, this is
impossible. So $m = 1$, and we can conclude that $G$ has exactly one subgroup
of order $p$. We conclude, as above, that $G$ is either cyclic or isomorphic to the
quaternion group of order 8.
The final statement of the lemma follows at once from the fact that the 
quaternion group has a unique element of order 2 and hence each of its 
automorphisms leaves this element fixed. 


3. The case N(A) = A. In this section we shall prove the following result 
concerning the structure of ABA groups: 


THEOREM 1. Let $G$ be an $ABA$ group, in which $A$ and $B$ are cyclic subgroups
of relatively prime orders $h$ and $k$ respectively. Then if $A$ is its own normalizer
in $G$, $G$ contains a normal subgroup $T$ with $A \cap T = 1$. Furthermore $T$ is either
Abelian or the direct product of an Abelian group of odd order with the quaternion
group of order 8. In particular, $G$ is solvable, and of order $hkw$, where $w \mid k^{v}$ for
some integer $v$.


Proof. We shall prove first that the Sylow subgroups of A are, in fact, 
Sylow subgroups of G. The proof is by induction on the order of G. 

Let $S_p$ be a p-Sylow subgroup of A. Since A is Abelian, $N(S_p) \supseteq A$. If 
$x = a_1 b_1 a_2 \in N(S_p)$ with $b_1 \in B$, $a_1, a_2 \in A$, then clearly $b_1 \in N(S_p) \cap B$. If 
$B_1 = B \cap N(S_p)$, then obviously $N(S_p)$ is of the form $A B_1 A$; thus if $N(S_p)$ 
is a proper subgroup of G, it follows by induction that the order of $N(S_p)$ 
is $h k_1 w_1$, where $k_1 = o(B_1)$ and $w_1 \mid k_1^\nu$ for some integer $\nu$. Since $(h, k_1) = 1$, 
we see then that $S_p$ is a p-Sylow subgroup of $N(S_p)$. But $S_p$ must then be a 
p-Sylow subgroup of G, since the normalizer of a proper subgroup of a p-group 
is always a strictly larger subgroup. 

On the other hand, if $N(S_p) = G$, then $S_p$ is normal in G, and we consider 
$\bar{G} = G/S_p = \bar{A}\bar{B}\bar{A}$, where $\bar{A}$, $\bar{B}$ are the images of A and B. Furthermore 
$N(\bar{A}) = \bar{A}$; for $N(\bar{A}) \supset \bar{A}$ would clearly imply $N(A) \supset A$ since $S_p$ is 
contained in A. Hence we can apply our induction hypothesis to $\bar{G}$, and we 
obtain $o(\bar{G}) = \bar{h}k\bar{w}$, where $\bar{w} \mid k^\nu$ and $\bar{h} = o(\bar{A})$. Thus $p \nmid o(\bar{G})$, and so $S_p$ is 
a Sylow subgroup of G. 

Since this holds for every $p \mid h$, it follows that the order and index of A are 
relatively prime. Since A is Abelian, we may apply Lemma 1 to conclude 
that the transfer $\tau$ of G into A maps the intersection of A with the centre 
of its normalizer onto itself. But by assumption, A is the centre of its 
normalizer, and so $\tau$ maps G homomorphically onto A. 

Let T be the kernel of $\tau$; since $\tau$ maps A onto itself, $A \cap T = 1$; since 
the order of B is relatively prime to the order of A, $B \subset T$. If a, b are generators 
of A, B respectively, it is clear that T consists precisely of the elements of G 
of the form $a^i b^j a^{-i}$, where i, j are arbitrary. Now the mapping $\alpha$ defined on 
T by $\alpha(x) = axa^{-1}$ is an automorphism of T, and every element of T is of 
the form $\alpha^i(b^j)$. Therefore by Lemma 2, T is of the form stated in the theorem. 

Since every element of T, being of the form $\alpha^i(b^j)$, has order a divisor of k, 
o(T) = kw where $w \mid k^\nu$ for some integer $\nu$. Since $G/T \cong A$, and A is cyclic 
of order h, G is solvable of order hkw. This completes the induction and the 
proof of the theorem. 
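
As an illustration of Theorem 1 (not taken from the paper), the alternating group on four letters can be written as ABA with A of order 3 and B of order 2. The sketch below assumes permutations of {0, 1, 2, 3} stored as tuples and the hypothetical generators `a = (0 1 2)` and `b = (0 1)(2 3)`; it checks that A is its own normalizer and that the elements $a^i b^j a^{-i}$ form a normal subgroup T meeting A trivially.

```python
from itertools import product

IDENTITY = (0, 1, 2, 3)

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(4))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * 4
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def cyclic_group(g):
    """List the powers of g, starting from the identity."""
    elems, x = [IDENTITY], g
    while x != IDENTITY:
        elems.append(x)
        x = compose(x, g)
    return elems

a = (1, 2, 0, 3)  # assumed generator of A: the 3-cycle (0 1 2)
b = (1, 0, 3, 2)  # assumed generator of B: the product (0 1)(2 3)
A, B = cyclic_group(a), cyclic_group(b)
h, k = len(A), len(B)

# G is the set of all products x*y*z with x, z in A and y in B.
G = {compose(compose(x, y), z) for x, y, z in product(A, B, A)}
closed = all(compose(x, y) in G for x in G for y in G)

# Normalizer of A inside G.
normalizer_of_A = {
    g for g in G
    if {compose(compose(g, x), inverse(g)) for x in A} == set(A)
}

# T consists of the elements a^i b^j a^(-i).
T = {compose(compose(x, y), inverse(x)) for x in A for y in B}
T_is_subgroup = all(compose(x, y) in T for x in T for y in T)
T_is_normal = all(compose(compose(g, t), inverse(g)) in T for g in G for t in T)
```

Here G has order 12 = hkw with h = 3, k = 2, w = 2, and T is the four-group of order kw = 4.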


COROLLARY. A Sylow subgroup of G is either Abelian or isomorphic to the 
quaternion group of order 8. 


4. The main theorem. 


THEOREM 2. Let G be an ABA group in which A and B are cyclic subgroups 
of relatively prime orders h and k respectively. Then 


1. G is solvable. 

2. The p-Sylow subgroups of G, for odd p, are Abelian; 

3. The 2-Sylow subgroup of G is either Abelian or isomorphic to the quaternion 
group of order 8. 


4. The order of G is hkw, where $w \mid k^\nu$ for some integer $\nu$. 

Proof. If N(A) = A, the theorem follows immediately from Theorem 1 
and its corollary. We may therefore assume that $N(A) \supset A$. If a, b denote, 
as above, generators of A and B, there is an element of the form $a^i b^j a^e$ in 
N(A) with $b^j \neq 1$, whence $b^j$ itself is in N(A). Let r be the least positive 
integer such that $b^r \in N(A)$. Then $r \mid k$, and we have for some integer $\lambda$ 

$$(4.1)\quad b^r a b^{-r} = a^\lambda, \quad \text{where } \lambda^{k/r} \equiv 1 \pmod{h}.$$


Let p be a prime dividing k/r and define $\gamma_p$ as the least multiple of r such 
that $k/\gamma_p$ is a power of p. Set 

$$B_p = (b^{\gamma_p}).$$

In the first part of the proof we shall establish the following statement: 

The normalizer $N(B_p)$ of $B_p$ is of the form $A_p B A_p$ for a suitable subgroup $A_p$ 
of A, and furthermore $B_p$ is in the centre of $N(B_p)$. 


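We shall need one preliminary result. Since $r \mid \gamma_p$, we have 

$$(4.2)\quad b^{\gamma_p} a b^{-\gamma_p} = a^{\lambda_p}, \qquad \lambda_p^{k/\gamma_p} \equiv 1 \pmod{h}$$

for some integer $\lambda_p$. Let $u_p = (\lambda_p - 1, h)$. We assert that 

$$\left(u_p, \frac{h}{u_p}\right) = 1.$$

For suppose a prime $q \mid (\lambda_p - 1)$. It is sufficient to show that if $q^\epsilon \mid h$, then 
$q^\epsilon \mid (\lambda_p - 1)$. Since $k/\gamma_p = p^s$ for some integer s, (4.2) implies that 

$$\lambda_p^{p^s} \equiv 1 \pmod{h}$$

and hence 

$$\lambda_p^{p^s} \equiv 1 \pmod{q^\epsilon}.$$

Write $\lambda_p = 1 + xq^\delta$, where (x, q) = 1. Then $(1 + xq^\delta)^{p^s} \equiv 1 \pmod{q^\epsilon}$ and 
so $p^s x q^\delta + y q^{2\delta} \equiv 0 \pmod{q^\epsilon}$ for some integer y. Since $p \mid k$ and $q \mid h$, and 
(h, k) = 1, $p \neq q$ and hence $\delta \geq \epsilon$. 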
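
The assertion just proved is easy to test numerically. The sketch below is only a brute-force check under stated assumptions: the function name `check_assertion` and the ranges tried are hypothetical, and `lam` stands for any residue playing the role of $\lambda_p$, that is, any residue whose $p^s$-th power is 1 modulo h.

```python
from math import gcd

def check_assertion(h, p, s):
    """Return True if gcd(u, h // u) == 1 for every lam with
    lam**(p**s) congruent to 1 (mod h), where u = gcd(lam - 1, h)."""
    for lam in range(1, h + 1):
        if pow(lam, p ** s, h) == 1 % h:
            u = gcd(lam - 1, h)
            if gcd(u, h // u) != 1:
                return False
    return True
```

For instance, `check_assertion(15, 2, 1)` examines lam = 1, 4, 11, 14 with u = 15, 3, 5, 1 respectively. The hypothesis that p does not divide h cannot be dropped: with h = 4, p = 2, lam = 3 one gets u = 2 and h/u = 2.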
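We now return to $N(B_p)$. Suppose that $a^i b^j a^e \in N(B_p)$. Then 

$$a^i b^j a^e b^{\gamma_p} a^{-e} b^{-j} a^{-i} = b^{\gamma_p m}$$

for some integer m. 
Applying (4.2) to this relation, we obtain 

$$(4.3)\quad a^i b^j a^{e(1-\lambda_p)} b^{-j} a^{-i\lambda_p} = b^{\gamma_p(m-1)}.$$

Suppose first that 

$$e \equiv 0 \pmod{h/u_p}.$$

Since $u_p \mid (\lambda_p - 1)$, (4.3) reduces to 

$$a^{i(1-\lambda_p)} = b^{\gamma_p(m-1)}$$

and their common value is 1, since a and b have relatively prime order. But 
$u_p = (1 - \lambda_p, h)$, and so 

$$i \equiv 0 \pmod{h/u_p};$$

moreover, 

$$m \equiv 1 \pmod{k/\gamma_p},$$

and hence $a^i b^j a^e$ commutes with $b^{\gamma_p}$. 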
Conversely, every element of the form 

$$a^{\sigma h/u_p}\, b^j\, a^{\tau h/u_p}$$

(for integers $\sigma$, $\tau$) is in $N(B_p)$ and commutes with $b^{\gamma_p}$. To complete the 
proof of our assertion, we shall show that every element of $N(B_p)$ is of this 
form. We have just shown this to be the case if 

$$e \equiv 0 \pmod{h/u_p}.$$

Suppose, on the other hand, that $e(1 - \lambda_p) \not\equiv 0 \pmod{h}$. Then (4.3) yields 
the relation 

$$b^j a^{e(1-\lambda_p)} b^{-j} = a^{-i} b^{\gamma_p(m-1)} a^{i\lambda_p}$$

and hence $b^j a^{e(1-\lambda_p)} b^{-j}$ is in N(A). But this element has order dividing h; 
since (h, k) = 1, all the elements of N(A) of order dividing h are already 
in A. Thus 

$$(4.4)\quad b^j a^{e(1-\lambda_p)} b^{-j} = a^{e(1-\lambda_p)\rho}$$

for some integer $\rho$. 
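Using (4.4), we can rewrite the element 

$$a^i b^j a^e = a^i b^j a^{-xe(1-\lambda_p)} a^{e+xe(1-\lambda_p)}$$

as 

$$a^{i-xe(1-\lambda_p)\rho}\, b^j\, a^{e+xe(1-\lambda_p)}.$$

Since 

$$\left(1 - \lambda_p, \frac{h}{u_p}\right) = 1,$$

we can find an integer x such that 

$$e + xe(1 - \lambda_p) \equiv 0 \pmod{h/u_p}.$$

If, for this x, we set $i' = i - xe(1 - \lambda_p)\rho$ and $e' = e + xe(1 - \lambda_p)$, we have 
$a^i b^j a^e = a^{i'} b^j a^{e'}$, where 

$$e' \equiv 0 \pmod{h/u_p},$$

and hence 

$$a^i b^j a^e = a^{\sigma h/u_p}\, b^j\, a^{\tau h/u_p}$$

for suitable integers $\sigma$, $\tau$. If 

$$A_p = (a^{h/u_p}),$$

we have thus proved that $N(B_p) = A_p B A_p$, and that $B_p$ is in the centre 
of $N(B_p)$. 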


5. Continuation of the proof. The proof now proceeds by induction on 
the order of G, but we add the following statement to our induction 
hypothesis: if $p \mid k$ then some p-Sylow subgroup of G consists of all the elements 
of the form $a^{si} b^{tj} a^{-si}$ for suitable integers s, t where i, j are arbitrary. 

There are three cases to consider, which we take up in succession. 


Case 1. $N(B_p) = G$. In this case $B_p$ is in the centre of G, and we define 
$\bar{G} = G/B_p = \bar{A}\bar{B}\bar{A}$, where $\bar{A}$ has order h and $\bar{B}$ has order $\gamma_p$. By induction, 
$\bar{G}$ is solvable of order $h\gamma_p w$, where $w \mid \gamma_p^\nu$; so G is solvable and its order is 

$$(h\gamma_p w)\,\frac{k}{\gamma_p} = hkw$$

where $w \mid \gamma_p^\nu \mid k^\nu$. 

Hence the order of A is relatively prime to its index in G. Thus the Sylow 
subgroups of G, for any prime dividing h, are cyclic. 

Furthermore, the Sylow subgroups $\bar{S}_q$ of $\bar{G}$ for primes q which divide $\gamma_p$ are 
of the form $\{\bar{a}^{si}\, \bar{b}^{tj}\, \bar{a}^{-si}\}$ for suitable integers s and t. If $q \neq p$, then it follows 
as in the proof of Lemma 2 that the elements 

$$\{a^{si}\, b^{t(k/\gamma_p)j}\, a^{-si}\}$$

form a q-Sylow subgroup of G which maps isomorphically on $\bar{S}_q$. If q = p, it 
follows again as in the proof of Lemma 2 that the complete inverse image of 
a suitable p-Sylow subgroup $\bar{S}_p$ of $\bar{G}$ is a p-Sylow subgroup of G and is of the 
form $\{a^{si}\, b^{tj}\, a^{-si}\}$ for suitable s, t. Thus for each prime p dividing k a p-Sylow 
subgroup $S_p$ of G is of the required form. If $\alpha$ denotes the automorphism of 
$S_p$ defined by $\alpha(x) = a^s x a^{-s}$ for x in $S_p$, then every element of $S_p$ is of the 
form $\alpha^i(b^{tj})$. Hence by Lemma 2, $S_p$ is either Abelian or isomorphic to the 
quaternion group of order 8. Our induction is therefore complete in the case 
$N(B_p) = G$. 

We may therefore assume $N(B_p) < G$. 


Case 2. $p \neq 2$. By our induction hypothesis some p-Sylow subgroup $S_p$ 
of $N(B_p)$ is of the form 

$$\{a^{\sigma(h/u_p)i}\, b^{\gamma_1 j}\, a^{-\sigma(h/u_p)i}\}$$

for some integers $\sigma$ and $\gamma_1$, and is Abelian. Since $B_p$ is in the centre of $N(B_p)$, 
$B_p \subset S_p$ and hence $\gamma_1 \mid \gamma_p$. We shall prove first that $S_p$ is cyclic. Suppose $b^{\gamma_1}$ has 
order $p^c$. Then clearly $S_p$ is Abelian of type $(p^c, p^c, \ldots, p^c)$. But since 

$$a^{\sigma h/u_p}$$

commutes with $b^{\gamma_p}$, $B_p$ is the only subgroup of its order in $S_p$, and hence $S_p$ 
is cyclic. 

Since $S_p$ is cyclic, $B_p$ is a characteristic subgroup of $S_p$, and so $N(S_p) \subset N(B_p)$. 
If $S_p$ were not a p-Sylow subgroup of G, its normalizer would contain a strictly 
larger p-group than $S_p$. But since $S_p$ is a p-Sylow subgroup of $N(B_p)$ and 
since $N(S_p) \subset N(B_p)$, it must be that $S_p$ is in fact a p-Sylow subgroup of G. 

But now by a theorem of Grün (or by Lemma 1), the transfer $\tau$ of G into 
the Abelian Sylow subgroup $S_p$ maps G onto the intersection of $S_p$ with the 
centre of its normalizer. But $B_p$ is contained in the centre of its normalizer. 
Thus $\tau$ maps G homomorphically on $(b^{\gamma_2})$ where $\gamma_1 \mid \gamma_2 \mid \gamma_p$. Since the order of 
A is relatively prime to that of B, A is contained in the kernel H of $\tau$. Also H 
contains some proper subgroup B* of B, and hence H is of the form AB*A. 
Since G/H has order $k/\gamma_2$, B* is of order $\gamma_2$. Thus for any prime different from 
p, the Sylow subgroups of H are Sylow subgroups of G, and hence by induction 
are of the required form. As we have already seen $S_p$ itself is cyclic and of the 
required form. By induction H is solvable and $o(H) = h\gamma_2 w$ where $w \mid \gamma_2^\nu$ for 
some integer $\nu$. Since H and G/H are solvable, G is solvable and the order of 
G is o(G/H)o(H). Thus $o(G) = (h\gamma_2 w)k/\gamma_2 = hkw$, where $w \mid \gamma_2^\nu \mid k^\nu$. It follows 
at once that the Sylow subgroups of G, for any prime dividing h, are cyclic. 
The proof is complete in this case. 


Case 3. p = 2. If $S_2$ is Abelian, or if an odd prime divides k/r, the above 
proof holds without change. There remains then but one case to consider: 
namely, when $k/r = 2^s$ and the 2-Sylow subgroup $S_2$ of $N(B_2)$ is isomorphic 
to the quaternion group of order 8. In this case $\gamma_1 = \gamma_2 = r$ and $B_2 = (b^r)$. 

Since the quaternion group has no element of order 8, $8 \nmid k$. On the other 
hand, suppose $2 \nmid r$. Then $\bar{N} = N(B_2)/B_2$ is of the form $\bar{A}_2\bar{B}\bar{A}_2$, where $\bar{B}$ has 
order r. Then by our induction hypothesis, $o(\bar{N}) = o(\bar{A}_2)o(\bar{B})w = o(\bar{A}_2)rw$, 
where $w \mid r^\nu$, and so $o(\bar{N})$ would be odd. But then $S_2 = B_2$, contrary to our 
assumption that $S_2$ is the quaternion group. Hence we must have $2 \mid r$. Since 
k/r is a power of 2 and $8 \nmid k$, it follows that r/2 is odd and k/r is 2. 

We are thus reduced to considering the following situation: 


$$(5.1)\quad b^r a b^{-r} = a^\lambda, \quad 2 \mid r, \quad r/2 \text{ is odd, and } k/r = 2.$$


Let $(\lambda - 1, h) = u$. If u = 1, then $N(B_2) = B$ (since $N(B_2)$ is of the form 
$\{a^{(h/u)i}\, b^j\, a^{(h/u)e}\}$). But then again $S_2$ would be cyclic. So we may assume that 
u > 1. 

We may further assume that no subgroup of A is normal in G, for the 
theorem follows easily by induction in this case. In particular, this implies, as 
in the proof of Theorem 1, that the order of A is relatively prime to its index. 

From (5.1), we have $\lambda^2 - 1 \equiv 0 \pmod{h}$, and hence $u(\lambda + 1) \equiv 0 \pmod{h}$. 
Thus $b^r a^u b^{-r} = a^{\lambda u} = a^{-u}$, and similarly $b^r a^{-u} b^r = a^u$. Now as we have already 
seen, $b^r$ commutes with $a^{h/u}$. But N(A) is generated by a and $b^r$, and hence 
$a^{h/u} \neq 1$ is in the centre of N(A). Since the order of A is relatively prime to 
its index, it follows from Lemma 1 that the transfer of G into A maps the 
intersection of A with the centre of its normalizer onto itself. Thus $(a^{h/u})$ is 
mapped onto itself by the transfer map. Hence the kernel H of the transfer 
of G into A consists of all elements of the form $a^i b^j a^{-i+mu}$. Suppose now that 
$x = a^i b^j a^{-i+mu}$ is an element of order 2 in H. Thus $1 = x^2 = a^i b^j a^{-i+mu} a^i b^j a^{-i+mu}$, 
whence 

$$(5.2)\quad b^j a^{mu} b^j = a^{-mu}.$$
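
The arithmetic step at the start of this paragraph, that $\lambda^2 \equiv 1 \pmod{h}$ forces $u(\lambda + 1) \equiv 0 \pmod{h}$ and hence $\lambda u \equiv -u \pmod{h}$, can be confirmed by direct computation. The following is a minimal sketch; the function names are hypothetical and `lam` stands for $\lambda$.

```python
from math import gcd

def check_step(h):
    """Return True if u*(lam + 1) is divisible by h for every lam with
    lam**2 congruent to 1 (mod h), where u = gcd(lam - 1, h)."""
    for lam in range(1, h + 1):
        if (lam * lam - 1) % h == 0:
            u = gcd(lam - 1, h)
            if (u * (lam + 1)) % h != 0:
                return False
    return True

def conjugation_inverts(h, lam):
    """Return True if lam*u is congruent to -u (mod h), i.e. a^u is sent to a^(-u)."""
    u = gcd(lam - 1, h)
    return (lam * u) % h == (-u) % h
```

For example, with h = 15 and lam = 4 one has u = 3, u(lam + 1) = 15, and lam*u = 12, which is -3 modulo 15.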

Conjugating this relation by $b^r$ we obtain $b^j a^{-mu} b^j = a^{mu}$. Thus 
$b^j a^{mu} b^j b^j a^{-mu} b^j = a^{-mu} a^{mu} = 1$, and so 

$$(5.3)\quad a^{mu} b^{2j} a^{-mu} = b^{-2j}.$$

Thus $a^{2mu} b^{2j} a^{-2mu} = b^{2j}$. But (2, h) = 1, so that we must have $a^{mu} b^{2j} a^{-mu} = b^{2j}$. 
Equation (5.3) now yields that $b^{4j} = 1$. Consequently $k \mid 4j$; that is, $k/4 = (r/2) \mid j$. 
Thus $x^2 = 1$ implies that 


Suppose next that j = r/2. Then (5.2) becomes $b^{r/2} a^{mu} b^{r/2} = a^{-mu}$, so that 
$b^{r/2} a^{mu} b^{-r/2} = a^{-mu} b^{-r} = a^{-mu} b^r$ since k = 2r; but now the element on the 
right-hand side of this relation is of order 2, while that on the left-hand side is 
not, which is a contradiction. Similarly, $j = -\tfrac{1}{2}r$ is impossible. 

We have thus proved that if $x = a^i b^j a^{-i+mu}$ is an element of order 2 in H, 
then j = r, and so x is of the form $b^r a^{mu}$. Since H is normal in G, and since 
$b^r a^u$ is of order 2, $b(b^r a^u)b^{-1}$ is of order 2 and is in H. It follows that 
$b(b^r a^u)b^{-1} = b^r a^{mu}$ for some integer m, and hence $b a^u b^{-1} = a^{mu}$. Thus $a^u$ generates a 
normal subgroup of G, in contradiction to our present assumption that no 
subgroup of A is normal in G. This contradiction completes the proof of the 
theorem. 


REFERENCES 


1. D. Gorenstein, A Class of Frobenius Groups, Can. J. Math., 11 (1959), 39-47. 

2. I. N. Herstein and I. Kaplansky, Groups of Cyclic Length Three, project document No. 13, 
Summer Mathematical Conference, Bowdoin College, Brunswick, Maine (1957). 

3. H. Zassenhaus, The Theory of Groups (New York, 1949). 


Clark University 
and 
Cornell University 




























